GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:          December 13, 2003, 04:37:24 ; Search time 1684.26 Seconds
                 (without alignments)
                 7460.491 Million cell updates/sec

Title:           US-09-852-261-1
Perfect score:   517
Sequence:        1 ggaccggagacgctctgcgg tgaaatacacaagtaaacat 517

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        22781392 seqs, 12152238056 residues

Total number of hits satisfying chosen parameters:   45562784

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       EST:



                 1:  em_estba:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estmu:*
                 5:  em_estov:*
                 6:  em_estpl:*
                 7:  em_estro:*
                 8:  em_htc:*
                 9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
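
The header figures and the per-hit percentages in this report can be cross-checked against one another. The sketch below is an illustration only, not GenCore code. It assumes that the Query Match percentage is the hit score divided by the perfect score, and that the cell-update rate counts both strands of every database residue (an assumption suggested by the hit total, 45562784, being exactly twice the 22781392 sequences searched). The function names are hypothetical.

```python
# Illustrative cross-check of the report's figures (not GenCore code).
PERFECT_SCORE = 517            # "Perfect score" from the header
DB_RESIDUES = 12_152_238_056   # "Searched: ... residues"
SEARCH_SECONDS = 1684.26       # "Search time"


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Assumed relation: Query Match % = 100 * score / perfect score."""
    return round(100.0 * score / perfect_score, 1)


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Assumed relation: one cell per query base, per database base, per strand."""
    return query_len * db_residues * strands / seconds / 1e6


if __name__ == "__main__":
    # Scores taken from the first, second and last summary rows.
    for score in (344.2, 331.6, 234.2):
        print(score, query_match_percent(score))
    rate = million_cell_updates_per_sec(PERFECT_SCORE, DB_RESIDUES, SEARCH_SECONDS)
    print(round(rate, 1))
```

Under these assumptions the computed percentages agree with the Query Match column, and the computed rate agrees with the reported throughput to within the rounding of the search time.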

                                 SUMMARIES

                   %
 Result           Query
    No.    Score  Match  Length  DB  ID         Description

      1    344.2   66.6     796  14  CB959991   CB959991 AGENCOURT
 c    2    331.6   64.1     558   9  AI503976   AI503976 vm43d08.x
 c    3    330.6   63.9     673  12  BM984670   BM984670 UI-CF-EC1
 c    4    329.8   63.8     623   9  AW146128   AW146128 um37e10.x
 c    5    326.6   63.2     575   9  AI248089   AI248089 qh69f05.x
 c    6    316.6   61.2     549   9  AI169253   AI169253 EST215088
 c    7    315.8   61.1     558   9  AI265629   AI265629 uj04b07.x
 c    8    314.8   60.9     498   9  AA542914   AA542914 ni98c10.s
      9    310     60.0     614  14  CD373004   CD373004 UI-R-GR0-
     10    309     59.8     816   9  AI119218   AI119218 ue94h02.y
     11    303.6   58.7     594  10  BF383724   BF383724 602044632
 c   12    299.8   58.0     527   9  AA913900   AA913900 ol35g05.s
 c   13    289.6   56.0     642   9  AI876493   AI876493 uj59b10.x
 c   14    287.4   55.6     499   9  AW495481   AW495481 UI-M-BH3-
 c   15    276     53.4     468   9  AI169770   AI169770 EST215669
     16    274.4   53.1     882   9  AI604642   AI604642 vm43d08.y
 c   17    268.2   51.9     430   9  AI478804   AI478804 tm52e04.x
 c   18    263.2   50.9     653  13  BQ200567   BQ200567 UI-R-DZ1-
     19    258.4   50.0     608   9  AL599807   AL599807 DKFZp3130
 c   20    254.6   49.2     486   9  AA993659   AA993659 ot85g11.s
 c   21    254.2   49.2     521   9  AW493459   AW493459 UI-M-BH3-
     22    254.2   49.2     559  12  BI715603   BI715603 ic34h10.y
     23    254.2   49.2     602  13  BU590710   BU590710 AGENCOURT
     24    254.2   49.2     621  12  BI221656   BI221656 602936980
     25    254.2   49.2    1658  11  AK081019   AK081019 Mus muscu
     26  [illegible] 49.1   356   9  AW297586   AW297586 UI-H-BW0-
 c   27    253.2   49.0     595   9  AI573421   AI573421 mo04b11.x
 c   28    252.6   48.9     499  12  BI676839   BI676839 ic56a08.x
 c   29    252.6   48.9     500   9  AA945553   AA945553 EST201052
 c   30    252.6   48.9     525   9  AA963258   AA963258 UI-R-E1-g
     31    251.4   48.6     482   9  AA456717   AA456717 aa13h06.r
 c   32    251     48.5     706   9  AI401719   AI401719 th30b10.x
 c   33    249.4   48.2     525   9  AI599751   AI599751 EST251454
     34    248.6   48.1     665   9  AA690767   AA690767 vu57d12.r
     35    247.8   47.9     559  12  BI715465   BI715465 ic33b09.y
     36    247.4   47.9     799   9  AI314558   AI314558 uj48d07.y
 c   37    247.2   47.8     499  12  BI294072   BI294072 UI-R-DK0-
 c   38    244.2   47.2     502   9  AI104669   AI104669 EST213958
 c   39    243     47.0     561  12  BI714874   BI714874 ic33b09.x
 c   40    240.6   46.5     564  12  BI714981   BI714981 ic34h10.x
     41    239.2   46.3    2170  11  AK038119   AK038119 Mus muscu
     42    237.4   45.9     558  12  BI715475   BI715475 ic33c08.y
 c   43    237.2   45.9     480   9  AA621551   AA621551 af47c10.s
     44    236.8   45.8     512   9  AI876203   AI876203 uj59b10.y
     45    234.2   45.3     949  14  CB589117   CB589117 AGENCOURT



ALIGNMENTS 



RESULT 1
CB959991
LOCUS       CB959991      796 bp    mRNA    linear   EST 29-APR-2003
DEFINITION  AGENCOURT_13888044 NIH_MGC_147 Homo sapiens cDNA clone
            IMAGE:30341081 5', mRNA sequence.
ACCESSION   CB959991
VERSION     CB959991.1  GI:30216107
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 796)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Stefan Hansson
            cDNA Library Preparation: Michael J. Brownstein (NHGRI) with help
            and advice from Piero Carninci (RIKEN)
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: NDAM371 row: p column: 1 18
            High quality sequence stop: 707.
FEATURES             Location/Qualifiers
     source          1..796
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:30341081"
                     /tissue_type="Human Placenta"
                     /lab_host="DH10B TonA"
                     /clone_lib="NIH_MGC_147"
                     /note="Organ: placenta; Vector: pBluescriptR; Site_1:
                     all-XhoI; Site_2: BamH; Oligo-dT primed using primer
                     5'-TTTTTTTTTTTTTTTTVN-3', size-selected for average
                     insert size 2.3 kb and normalized to ROT 5. This is a
                     primary library enriched for full-length clones and
                     constructed using the Cap-trapper method (Carninci, in
                     preparation). Library constructed by M. Brownstein
                     (NIMH/NHGRI, National Institutes of Health). Note: This is
                     a NIH_MGC library."
BASE COUNT      224 a    197 c    191 g    184 t
ORIGIN



Query Match 66.6%; Score 344.2; DB 14; Length 796; 

Best Local Similarity 87.3%; Pred. No. 8.3e-81; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 



Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  180 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 239

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  240 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 299

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  300 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 359

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  360 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 419

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  420 ATGCCCAAGACCCAG--------------------------------------------- 434

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  435 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 490

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  491 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 550

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  551 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 610

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  611 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 651
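
The statistics printed above each alignment appear to be related in a simple way. The sketch below is an illustration, not GenCore code: it assumes that Best Local Similarity is the number of matches divided by the number of alignment columns (matches plus mismatches plus indels), and it rebuilds the marker line between a Qy row and a Db row by placing a bar wherever the two aligned bases are identical. Both function names are hypothetical, and the sequences are assumed to be upper case with "-" for gap positions.

```python
def best_local_similarity(matches, mismatches, indels):
    """Assumed relation: matches as a percentage of all alignment columns."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


def match_line(qy, db):
    """Return '|' under identical aligned bases and ' ' elsewhere."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == d and q != "-" else " "
        for q, d in zip(qy, db)
    )


if __name__ == "__main__":
    # Counts as printed for RESULT 1.
    print(best_local_similarity(455, 13, 53))
    # Start of the last alignment block of RESULT 1.
    print("AAAGAT-GGCA")
    print(match_line("AAAGAT-GGCA", "AAAGATGGGCG"))
    print("AAAGATGGGCG")
```

With the counts reported for the first two results, this relation gives the Best Local Similarity values shown in the report.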



RESULT 2
AI503976/c
LOCUS       AI503976      558 bp    mRNA    linear   EST 11-MAR-1999
DEFINITION  vm43d08.x1 Stratagene mouse diaphragm (#937303) Mus musculus cDNA
            clone IMAGE:1001007 3' similar to gb:X04482 Mouse mRNA for
            preproinsulin-like growth factor IB (MOUSE);, mRNA sequence.
ACCESSION   AI503976
VERSION     AI503976.1  GI:4401827
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 558)
  AUTHORS   Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C., Wylie,T.,
            Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers,Y.,
            Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk,R.,
            Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann,R.,
            Waterston,R. and Wilson,R.
  TITLE     The WashU-NCI Mouse EST Project 1999
  JOURNAL   Unpublished
COMMENT     Contact: Marra M/WashU-NCI Mouse EST Project 1999
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI: 565223
            This clone was previously sequenced on the 5' end only, this new
            data is from the 3' end
            High quality sequence stop: 440.
FEATURES             Location/Qualifiers
     source          1..558
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:1001007"
                     /tissue_type="diaphragm"
                     /dev_stage="adult"
                     /lab_host="SOLR (kanamycin resistant)"
                     /clone_lib="Stratagene mouse diaphragm (#937303)"
                     /note="Organ: diaphragm; Vector: pBluescript SK-; Site_1:
                     EcoRI; Site_2: XhoI; Cloned unidirectionally from mRNA
                     prepared from diaphragm muscle. Primer: Oligo dT. Average
                     insert size: 1.5 kb. Uni-ZAP XR Vector; ~5' adaptor
                     sequence: 5' GAATTCGGCACGAG 3' ~3' adaptor sequence: 5'
                     CTCGAGTTTTTTTTTTTTTTTTTT 3'"
BASE COUNT      103 a    133 c    149 g    173 t
ORIGIN



  Query Match             64.1%;  Score 331.6;  DB 9;  Length 558;
  Best Local Similarity   82.0%;  Pred. No. 1.7e-77;
  Matches  433;  Conservative    0;  Mismatches   84;  Indels   11;  Gaps    4;

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||| ||||| || ||||||||||||||||||||||||||||||||||||||||||
Db  530 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCG 471

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||| ||||||||||||||||| ||||||||||||| ||||||||| ||||||
Db  470 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAG 411

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        |||||||| ||||||||||| |||||||||||||||||||| ||||| |||||||||||
Db  410 ACAGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGACTGGAGATGTAC 351

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        || || || || |||||| | ||  |||| |||||| ||||||||||||||||||| |||
Db  350 TGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGAC 291

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
        ||||||||||| |||||||  | | ||| ||| || |||||||| ||||||   ||   |
Db  290 ATGCCCAAGACTCAGAAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGG 231

Qy  298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
        ||||||||||||||||||||||||||||||||||||||| |||||||||||||||| |||
Db  230 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTA 171

Qy  358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
        ||| ||||| || || ||| |   ||||   | || | ||   |||||||||| ||||||
Db  170 CAGAATGTAGGAGGAGCCTCCCACGGAGCAGAAAATGCCACATCACCGCAGGATCCTTTG 111

Qy  417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATC 470
        ||           |    |||||| | || |||  |||      || |||||||   ||
Db  110 CTGCTTGAGCAACCTGCAAAACATCGAAACACCTACCAAATAACAATAATAAGTCCAATA 51

Qy  471 ACATTTCAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        ||||| ||||||| ||||||||||||||||||||| ||||||||||||
Db   50 ACATTACAAAGATGGGCATTTCCCCCAATGAAATATACAAGTAAACAT 3



RESULT 3
BM984670/c
LOCUS       BM984670      673 bp    mRNA    linear   EST 20-FEB-2003
DEFINITION  UI-CF-EC1-abj-k-24-0-UI.s1 UI-CF-EC1 Homo sapiens cDNA clone
            UI-CF-EC1-abj-k-24-0-UI 3', mRNA sequence.
ACCESSION   BM984670
VERSION     BM984670.1  GI:19610417
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 673)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: McCray, PB
            McCray Lab
            University of Iowa
            2024 University of Iowa Med Labs, Iowa City, IA 52242, USA
            Tel: 319 356 4866
            Fax: 319 356 7171
            Email: paul-mccray@uiowa.edu
            Tissue Procurement: Dr. M. J. Welsh, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Researchers may obtain clones from Research
            Genetics (www.resgen.com) or from Open Biosystems
            (www.openbiosystems.com).
            Seq primer: M13 FORWARD
            POLYA=Yes.
FEATURES             Location/Qualifiers
     source          1..673
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="UI-CF-EC1-abj-k-24-0-UI"
                     /tissue_type="Lung"
                     /dev_stage="Adult and Fetal"
                     /lab_host="DH10B (Life Technologies) (T1 phage resistant)"
                     /clone_lib="UI-CF-EC1"
                     /note="Organ: Lung; Vector: pT7T3-Pac (Pharmacia) with a
                     modified polylinker; Site_1: EcoR I; Site_2: Not I;
                     UI-CF-EC1 is a normalized cDNA library containing the
                     following tissue(s): Normal lung from adult and from fetal
                     day 64, day 87, week 19 and week 42. The library was
                     constructed according to Bonaldo, Lennon and Soares,
                     Genome Research, 6:791-806, 1996. First strand cDNA
                     synthesis was primed with an oligo-dT primer containing a
                     Not I site. Double stranded cDNA was ligated to an EcoR I
                     adaptor, digested with Not I, and cloned directionally
                     into pT7T3-Pac vector. The oligonucleotide used to prime
                     the synthesis of first-strand cDNA contains a library tag
                     sequence that is located between the Not I site and the
                     (dT)18 tail. The sequence tag for this library is
                     AAGTGCTTAC.
                     TAG_LIB=UI-CF-EC1
                     TAG_TISSUE=Normal Lung Epithelial Cells Tissue nos 369-371
                     and 380-383
                     TAG_SEQ=AAGTGCTTAC"
BASE COUNT      152 a    164 c    169 g    188 t
ORIGIN



  Query Match             63.9%;  Score 330.6;  DB 12;  Length 673;
  Best Local Similarity   86.9%;  Pred. No. 3.3e-77;
  Matches  453;  Conservative    0;  Mismatches   14;  Indels   54;  Gaps    6;

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  492 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 433

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||| |||||||||| |||||||||||||||||||||||||||||||||||||||||||
Db  432 AGGGG-TTTTATTTCAGCAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 374

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  373 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 314

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  313 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 254

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  253 ATGCCCAAGACCCAG--------------------------------------------- 239

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  238 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 183

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  182 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 123

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  122 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 63

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db   62 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 22



RESULT 4
AW146128/c
LOCUS       AW146128      623 bp    mRNA    linear   EST 10-OCT-2000
DEFINITION  um37e10.x1 Sugano mouse embryo mewa Mus musculus cDNA clone
            IMAGE:2247498 3' similar to gb:X04482 Mouse mRNA for
            preproinsulin-like growth factor IB (MOUSE);, mRNA sequence.
ACCESSION   AW146128
VERSION     AW146128.1  GI:6167864
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 623)
  AUTHORS   Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C., Wylie,T.,
            Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers,Y.,
            Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk,R.,
            Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann,R.,
            Waterston,R. and Wilson,R.
  TITLE     The WashU-NCI Mouse EST Project 1999
  JOURNAL   Unpublished
COMMENT     Contact: Marra M/WashU-NCI Mouse EST Project 1999
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI: 1006958
            Seq primer: custom primer used
            High quality sequence stop: 499.
FEATURES             Location/Qualifiers
     source          1..623
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:2247498"
                     /dev_stage="embryo, 14 dpc"
                     /lab_host="DH10B"
                     /clone_lib="Sugano mouse embryo mewa"
                     /note="Vector: pME18S-FL3; Site_1: DraIII (CACTGTGTG);
                     Site_2: DraIII (CACCATGTG); 1st strand cDNA was primed
                     with an oligo(dT) primer [ATGTGGCCTTTTTTTTTTTTTTTTT];
                     double-stranded cDNA was ligated to a DraIII adaptor
                     [TGTTGGCCTACTGG], digested and cloned into distinct DraIII
                     sites of the pME18S-FL3 vector (5' site CACTGTGTG, 3' site
                     CACCATGTG). XhoI should be used to isolate the cDNA
                     insert. Size selection was performed to exclude fragments
                     <1.5kb. Library constructed by Dr. Sumio Sugano
                     (University of Tokyo Institute of Medical Science).
                     Custom primers for sequencing: 5' end primer
                     CTTCTGCTCTAAAAGCTGCG and 3' end primer
                     CGACCTGCAGCTCGAGCACA."
BASE COUNT      123 a    138 c    170 g    191 t      1 others
ORIGIN



  Query Match             63.8%;  Score 329.8;  DB 9;  Length 623;
  Best Local Similarity   80.6%;  Pred. No. 5.3e-77;
  Matches  425;  Conservative    0;  Mismatches   92;  Indels   10;  Gaps    3;

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||| ||||| || | ||||||||||||||||||||||||||||||||||||||||
Db  541 GGACCAGAGACCCTTTTCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCG 482

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||| ||||||||||||||||| ||||||||||||| ||||||||| ||||||
Db  481 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAG 422

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        |||||||| ||||||||||| |||||||||||||||||||| ||||| ||||| |||||
Db  421 ACAGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGACTGGAAATGTAC 362

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        || || || || |||||| | ||  |||| |||||| ||||||||||||||||||| |||
Db  361 TGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGAC 302

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
        ||||||||||| |||||||  | | ||| ||| || |||||||| ||||||   ||   |
Db  301 ATGCCCAAGACTCAGAAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGG 242

Qy  298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
        |||||||||||||||||||||||||||| |||||||||| |||||||||||||||| |||
Db  241 AGAAGGAAAGGAAGTACATTTGAAGAACCCAAGTAGAGGAAGTGCAGGAAACAAGACCTA 182

Qy  358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
        ||| ||||| || || ||| |   ||||   | || | ||   |||||||||| ||||||
Db  181 CAGAATGTAGGAGGAGCCTCCCACGGAGCAGAAAATGCCACATCACCGCAGGATCCTTTG 122

Qy  417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATC 470
        ||           |    |||||| | ||  ||  |||      || |||||||   ||
Db  121 CTGCTTGAGCAACCTGCAAAACATCGAAACCCCTACCAAATAACAATAATAAGTCCAATA 62

Qy  471 ACATTTCAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        ||||| |||||||||   || ||||||||||||| ||||||||||||
Db   61 ACATTACAAAGATGGGCATTTCCCCAATGAAATATACAAGTAAACAT 15



RESULT 5
AI248089/c
LOCUS       AI248089      575 bp    mRNA    linear   EST 01-DEC-1998
DEFINITION  qh69f05.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA
            clone IMAGE:1849953 3' similar to gb:X57025_rna1 INSULIN-LIKE
            GROWTH FACTOR IA PRECURSOR (HUMAN);, mRNA sequence.
ACCESSION   AI248089
VERSION     AI248089.1  GI:3843486
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 575)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Insert Length: 918 Std Error: 0.00
            Seq primer: -40UP from Gibco
            High quality sequence stop: 380.
FEATURES             Location/Qualifiers
     source          1..575
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:1849953"
                     /sex="male"
                     /dev_stage="20 week-post conception fetus"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_fetal_liver_spleen_1NFLS_S1"
                     /note="Organ: Liver and Spleen; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Pac I; Site_2: Eco RI;
                     This is a subtracted version of the original Soares fetal
                     liver spleen 1NFLS library. 1st strand cDNA was primed
                     with a Pac I - oligo(dT) primer [5'
                     AACTGGAAGAATTAATTAAAGATCTTTTTTTTTTTTTTTTTTT 3'],
                     double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Pac I and cloned into the Pac I
                     and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization. Library
                     constructed by Bento Soares and M.Fatima Bonaldo."
BASE COUNT      135 a    152 c    131 g    156 t      1 others
ORIGIN



  Query Match             63.2%;  Score 326.6;  DB 9;  Length 575;
  Best Local Similarity   86.6%;  Pred. No. 3.7e-76;
  Matches  438;  Conservative    0;  Mismatches   15;  Indels   53;  Gaps    5;

Qy   16 TGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGACAGGGGCTTTTATTTC 75
        |||||||||||||||||| ||||||||||||||||||||| |||||||||||||||||||
Db  551 TGCGGGGCTGAGCTGGTGNATGCTCTTCAGTTCGTGTGTGAAGACAGGGGCTTTTATTTC 492

Qy   76 AACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGACAGGCATCGTGGAT 135
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  491 AACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGACAGGCATCGTGGAT 432

Qy  136 GAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAG 195
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  431 GAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAG 372

Qy  196 CCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAG 255
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  371 CCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAG 312

Qy  256 AAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGAAGGAAAGGAAGTACA 315
                                                         |||||||||||
Db  311 -------------------------------------------------AAGGAAGTACA 301

Qy  316 TTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAGGATGTA-GAAGACCC 374
        |||||||||| |||||||||||||||||||||||||||||||||||||||| ||||||||
Db  300 TTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAGGATGTAGGAAGACCC 241

Qy  375 TTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCTGCAC-AGTTACCTG 433
        | ||||||||||||||  |||| |||||||||||| |||||||||||||| |||||||||
Db  240 TCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTCTGCACGAGTTACCTG 181

Qy  434 -TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAAGAT-GGCATTTC 491
         ||||| |||||| |||  |||||||||||||||||| |||||| |||||| ||| ||||
Db  180 TTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTAAAAGATGGGCGTTTC 121

Qy  492 CCCCAATGAAATACACAAGTAAACAT 517
        ||||||||||||||||||||||||||
Db  120 CCCCAATGAAATACACAAGTAAACAT 95



RESULT 6
AI169253/c
LOCUS       AI169253      549 bp    mRNA    linear   EST 08-JAN-1999
DEFINITION  EST215088 Normalized rat kidney, Bento Soares Rattus sp. cDNA clone
            RKIBP33 3' end, mRNA sequence.
ACCESSION   AI169253
VERSION     AI169253.1  GI:4134375
KEYWORDS    EST.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (bases 1 to 549)
  AUTHORS   Lee,N.H., Glodek,A., Chandra,I., Mason,T.M., Quackenbush,J.,
            Kerlavage,A.R. and Adams,M.D.
  TITLE     Rat Genome Project: Generation of a Rat EST (REST) Catalog & Rat
            Gene Index
  JOURNAL   Unpublished
COMMENT     On Oct 6, 1998 this sequence version replaced gi:3705561.
            Other_ESTs: TC50779
            Contact: Lee, NH
            The Institute for Genomic Research
            9712, Medical Center Drive, Rockville, MD 20850, USA
            Tel: (301)-838-3529
            Fax: (301)-838-0208
            Email: nhlee@tigr.org
            Seq primer: M13-21.
FEATURES             Location/Qualifiers
     source          1..549
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /clone="RKIBP33"
                     /clone_lib="Normalized rat kidney, Bento Soares"
                     /note="Organ: kidney; Vector: pT7T3Pac; Site_1: EcoRI;
                     Site_2: NotI"
BASE COUNT      112 a    140 c    133 g    164 t
ORIGIN

Query Match 61.2%; Score 316.6; DB 9; Length 549; 

Best Local Similarity 80.8%; Pred. No. 1.7e-73; 

Matches 421; Conservative 0; Mismatches 89; Indels 11; Gaps 4; 

Qy 8 AGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGACAGGGGCT 67 

I I I I II I I I II I I I I I I I I I I I I I I I MINIM I I | | | | | | | || | I I I I I I I 
Db 549 AGACCCTTTGCGGGGCTGAGCTGGTGGACGCTCTTCAATTCGTGTGTGGACCAAGGGGCT 490 

Qy 68 T T TAT T T CAACAAGC C CACAGGGTAT GGCT C CAGCAGT C G GAGGGC GC CT CAGACAGGC A 127 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II II I I I I I I I I I 
Db 4 89 TTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGAAGGGCACCACAGACGGGCA 430 

Qy 12 8 T CGT GGAT GAGTGCT GCTT CCGGAGCTGT GAT CTAAGGAGGCT GGAGAT GTATTGCGCAC 187 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I 
Db 429 T T GT GGAT GAGT GT AGCT T C CGGAGCTGT GAT CT GAGGAG GCT GGAGAT GT ACTGT GCT C 370 

Qy 188 CCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACATGCCCA 247 

I II I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 369 C GCT GAAGC CT ACAAAGT CAGCTCGTT CC AT C C G GGC C CAGCGC C AC ATT GACATGCCCA 310 

Qy 24 8 AGAC C CAGAAGTAT CAGC C CC C AT CTACCAACAAGAAC AC GAAGT CT CA GAGAAGGA 304 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 309 AGACT CAGAAGTC CCAGCCCCTATCGACACACAAGAAAAGGAAGCT GCAAAGGAGAAGGA 250 

Qy 305 AAGGAAGT AC AT TT GAAGAACACAAGT AGAGGGAGT GCAGGAAACAAGAACT ACAGGAT G 364 

I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III 

Db 249 AAGGAAGT ACACT T GAAGAACACAAGT AGAGGAAGT GC AGGAAACAAGACT T ACAGAAT G 190 

Qy 365 TA- GAAGAC C CTT CT GAGGAGT GAAGAAG GAC AGGC CAC C GC AGGAC C CT TT G C T CT GC A 423 

II II II III I I I I I I I II I II I I I I I I I I II I I I I I I I I 

Db 189 TAGGAGGAGCCTCCCGAGGAACAGAAAATGCCACGTCACCGCAAGATCCTTTGCTGCTTG 130

Qy 424 CAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATCACATTTC 477

I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I 

Db 129 AGCAACCTGCAAAACATCGGAACACCTGCCAAATATCAATAATGAGTTCAATACCATTTC 70

Qy 478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 69 AGAGATGGGCATTTCCCTCAATGAAATACACAAGTAAACAT 29



RESULT 7 
AI265629/c 

LOCUS AI265629 558 bp mRNA linear EST 18-NOV-1998
DEFINITION uj04b07.x1 Sugano mouse liver mlia Mus musculus cDNA clone
IMAGE:1890901 3' similar to gb:X04482 Mouse mRNA for
preproinsulin-like growth factor IB (MOUSE);, mRNA sequence.
ACCESSION AI265629
VERSION AI265629.1 GI:3883787
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 558)
AUTHORS Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,
Theising,B., Wylie,T., Lennon,G., Soares,B., Wilson,R. and
Waterston,R.
TITLE The WashU-HHMI Mouse EST Project
JOURNAL Unpublished
COMMENT Contact: Marra M/Mouse EST Project
WashU-HHMI Mouse EST Project
Washington University School of Medicine
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
Tel: 314 286 1800
Fax: 314 286 1810
Email: mouseest@watson.wustl.edu
This clone is available royalty-free through LLNL; contact the
IMAGE Consortium (info@image.llnl.gov) for further information.
MGI:975225
Seq primer: custom primer used
High quality sequence stop: 495.
FEATURES Location/Qualifiers
source 1..558
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="IMAGE:1890901"
/sex="female"
/dev_stage="adult"
/lab_host="DH10B"
/clone_lib="Sugano mouse liver mlia"
/note="Organ: liver; Vector: pME18S-FL3; Site_1: DraIII
(CACTGTGTG); Site_2: DraIII (CACCATGTG); 1st strand cDNA
was primed with an oligo(dT) primer
[ATGTGGCCTTTTTTTTTTTTTTTTT]; double-stranded cDNA was
ligated to a DraIII adaptor [TGTTGGCCTACTGG], digested
and cloned into distinct DraIII sites of the pME18S-FL3
vector (5' site CACTGTGTG, 3' site CACCATGTG). XhoI should
be used to isolate the cDNA insert. Size selection was
performed to exclude fragments <1.5kb. Library
constructed by Dr. Sumio Sugano (University of Tokyo
Institute of Medical Science). Custom primers for
sequencing: 5' end primer CTTCTGCTCTAAAAGCTGCG and 3' end
primer CGACCTGCAGCTCGAGCACA."
BASE COUNT 106 a 135 c 156 g 161 t
ORIGIN



Query Match 61.1%; Score 315.8; DB 9; Length 558;



Best Local Similarity 80.8%; Pred. No. 2.7e-73; 

Matches 408; Conservative 0; Mismatches 87; Indels 10; Gaps 3; 



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 506 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCG 447 

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I II I I II I I I I I I I I I 
Db 446 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAG 387

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 386 ACAGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGACTGGAGATGTAC 327

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240 

II II II II I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I III 

Db 326 TGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGAC 267 

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 266 ATGCCCAAGACTCAGAAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGG 207

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 206 AGAAGGAAAGGPAGTACATTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTA 147 

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 146 CAGAATGTAGGAGGAGCCTCCCACGGAGCAGAAAATGCCACATCACCGCAGGATCCTTTG 87

Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATC 470

II I I I I II I I I I I I I III I I I I I I I I I I I 

Db 86 CTGCTTGAGCAACCTGCAAAACATCGAAACACCTACCAAATAACAATAATAAGTCCAATA 27

Qy 471 ACATTTCAAAGATGGCATTTCCCCC 495

I I I I I I I I II II M II I I I I 
Db 26 ACATTACAAAGATGGGCATTTCCCC 2 



RESULT 8 

AA542914/c 

LOCUS AA542914 498 bp mRNA linear EST 19-AUG-1997
DEFINITION ni98c10.s1 NCI_CGAP_Pr21 Homo sapiens cDNA clone IMAGE:984882 3'
similar to gb:X57025_rna1 INSULIN-LIKE GROWTH FACTOR IA PRECURSOR
(HUMAN);, mRNA sequence.
ACCESSION AA542914
VERSION AA542914.1 GI:2291394
KEYWORDS EST.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 498)
AUTHORS NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
TITLE National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
Tumor Gene Index
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Michael J. Brownstein, M.D., Ph.D., Michael R.
Emmert-Buck, M.D., Ph.D.
cDNA Library Preparation: M. Bento Soares, Ph.D.
cDNA Library Arrayed by: Greg Lennon, Ph.D.
DNA Sequencing by: Washington University Genome Sequencing Center
Clone distribution: NCI-CGAP clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
www-bio.llnl.gov/bbrp/image/image.html
Insert Length: 603 Std Error: 0.00
Seq primer: -40m13 fwd. ET from Amersham
High quality sequence stop: 412.
FEATURES Location/Qualifiers
source 1..498
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="IMAGE:984882"
/sex="male"
/tissue_type="normal prostate"
/lab_host="DH10B"
/clone_lib="NCI_CGAP_Pr21"
/note="Organ: prostate; Vector: pT7T3D-Pac (Pharmacia)
with a modified polylinker; 1st strand cDNA was prepared
from normal prostate bulk tissue, and was then primed with
a Not I - oligo(dT) primer. Double-stranded cDNA was
ligated to Eco RI adaptors (Pharmacia), digested with Not
I and cloned into the Not I and Eco RI sites of the
modified pT7T3 vector. Library is not normalized. Library
was constructed by Bento Soares and M. Fatima Bonaldo."
BASE COUNT 105 a 135 c 123 g 135 t
ORIGIN



Query Match 60.9%; Score 314.8; DB 9; Length 498;
Best Local Similarity 86.2%; Pred. No. 4.9e-73;
Matches 450; Conservative 0; Mismatches 17; Indels 55; Gaps 7;



Qy 1 GGACCGGAGACGCTCTGCGGGGC-TGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGA 59
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I
Db 476 GGACCGGAGAACTTTTGCGGGGCTTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGA 417

Qy 60 CAGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCA 119
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I
Db 416 CAGGGGC-TTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCA 358

Qy 120 GACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTA 179
I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I
Db 357 GACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTA 298

Qy 180 TTGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGA 239
I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 297 TTGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGA 238

Qy 240 CATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAG 299
I I I I I I I I I I I I I I I I
Db 237 CATGCCCAAGACCCAG-------------------------------------------- 222



Qy 300 AAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACA 359 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 221 -----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACA 167

Qy 360 GGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCT 418

MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 166 GGATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCT 107

Qy 419 CTGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476

I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I 
Db 106 CTGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTT 47

Qy 477 CAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 46 AAAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 5



RESULT 9 
CD373004 
LOCUS CD373004 614 bp mRNA linear EST 29-MAY-2003
DEFINITION UI-R-GR0-csv-j-17-0-UI.r1 UI-R-GR0 Rattus norvegicus cDNA clone
UI-R-GR0-csv-j-17-0-UI 5', mRNA sequence.
ACCESSION CD373004
VERSION CD373004.1 GI:31157094
KEYWORDS EST.
SOURCE Rattus norvegicus (Norway rat)
ORGANISM Rattus norvegicus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
Rattus.
REFERENCE 1 (bases 1 to 614)
AUTHORS Bonaldo,M.F., Lennon,G. and Soares,M.B.
TITLE Normalization and subtraction: two approaches to facilitate gene
discovery
JOURNAL Genome Res. 6 (9), 791-806 (1996)
MEDLINE 97044477
PUBMED 8889548
COMMENT Contact: Soares, MB
Coordinated Laboratory for Computational Genomics
University of Iowa
375 Newton Road, 4156 MEBRF, Iowa City, IA 52242, USA
Tel: 319 335 8250
Fax: 319 335 9565
Email: bento-soares@uiowa.edu
Tissue Procurement: James Lin, University of Iowa
cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
Clone Distribution: Distribution information can be found at
http://genome.uiowa.edu/distribution/rat.html
Seq primer: M13 REVERSE.
FEATURES Location/Qualifiers
source 1..614
/organism="Rattus norvegicus"
/mol_type="mRNA"
/strain="Sprague-Dawley"
/db_xref="taxon:10116"
/clone="UI-R-GR0-csv-j-17-0-UI"
/tissue_type="Whole embryo"
/dev_stage="embryo 13dpc"
/lab_host="DH10B (Life Technologies) (T1 phage resistant)"
/clone_lib="UI-R-GR0"
/note="Vector: pYX-Asc; Site_1: EcoR I; Site_2: Not I;
UI-R-GR0 is a cDNA library containing the following
tissue(s): rat whole embryo 13dpc. The library was
constructed according to Bonaldo, Lennon and Soares,
Genome Research, 6:791-806, 1996. Denatured RNA was size
fractionated on a 1% agarose gel. First strand cDNA
synthesis was primed with oligo-dT primer containing a Not
I site. Double strand cDNA was size selected according to
mRNA size fraction, ligated with EcoR I adaptor, digested
with NotI and then cloned directionally into pYX-Asc
vector. The library tag sequence located between the Not I
site and the polyA tail is CATCTCTACT. This library was
created for the University of Iowa Program for Rat Gene
Discovery and Mapping (Val Sheffield, Bento Soares and Tom
Casavant)."

BASE COUNT 171 a 168 c 154 g 119 t 2 others 

ORIGIN 



Query Match 60.0%; Score 310; DB 14; Length 614; 

Best Local Similarity 80.3%; Pred. No. 9.9e-72; 

Matches 388; Conservative 0; Mismatches 91; Indels 4; Gaps 2; 

Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

Mill | | | | | || I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 116 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGACGCTCTTCAGTTCGTGTGTGGACCA 175 



Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II III 
Db 176 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCACAG 235 



Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 236 ACGGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 295



Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240 

II II II II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I III 
Db 296 TGTGCTCCGCTGAAGCCTACAAAGTCAGCTCGTTCCATCCGGGCCCAGCGCCACACTGAC 355 

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I 

Db 356 ATGCCCAAGACTCAGAAGTCCCAGCCCCTATCGACACACAAGAAAAGGAAGCTGCAAAGG 415



Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III 

Db 416 AGAAGGAAAGGAAGTACACTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTA 475

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416

II I I I I I I II II III I I I I I I I II II I I I I I I I I II I I I I I I 

Db 476 CAGAATGTAGGAGGAGCCTCCCGAGGAACAGAAAATTCCACGTCACCGCATGATCCTTTG 535



Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476

II I I I I I I IE I I I I I I I I I I I I || I I I III 

Db 536 CTGCTTGAGCAACCTGCANAACATCGGAACACCTGCCAAATATCAATAATGAGTTCAATA 595

Qy 477 CAA 479 

I I 

Db 596 CCA 598 



RESULT 10 
AI119218 

LOCUS AI119218 816 bp mRNA linear EST 02-SEP-1998
DEFINITION ue94h02.y1 Sugano mouse embryo mewa Mus musculus cDNA clone
IMAGE:1498803 5' similar to gb:X04482 Mouse mRNA for
preproinsulin-like growth factor IB (MOUSE);, mRNA sequence.
ACCESSION AI119218
VERSION AI119218.1 GI:3519542
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 816)
AUTHORS Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,
Theising,B., Wylie,T., Lennon,G., Soares,B., Wilson,R. and
Waterston,R.
TITLE The WashU-HHMI Mouse EST Project
JOURNAL Unpublished
COMMENT Contact: Marra M/Mouse EST Project
WashU-HHMI Mouse EST Project
Washington University School of Medicine
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
Tel: 314 286 1800
Fax: 314 286 1810
Email: mouseest@watson.wustl.edu
This clone is available royalty-free through LLNL; contact the
IMAGE Consortium (info@image.llnl.gov) for further information.
MGI:936407
Seq primer: custom primer used
High quality sequence stop: 473.
FEATURES Location/Qualifiers
source 1..816
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="IMAGE:1498803"
/dev_stage="embryo, 14 dpc"
/lab_host="DH10B"
/clone_lib="Sugano mouse embryo mewa"
/note="Vector: pME18S-FL3; Site_1: DraIII (CACTGTGTG);
Site_2: DraIII (CACCATGTG); 1st strand cDNA was primed
with an oligo(dT) primer [ATGTGGCCTTTTTTTTTTTTTTTTT];
double-stranded cDNA was ligated to a DraIII adaptor
[TGTTGGCCTACTGG], digested and cloned into distinct DraIII
sites of the pME18S-FL3 vector (5' site CACTGTGTG, 3' site
CACCATGTG). XhoI should be used to isolate the cDNA
insert. Size selection was performed to exclude fragments
<1.5kb. Library constructed by Dr. Sumio Sugano
(University of Tokyo Institute of Medical Science).
Custom primers for sequencing: 5' end primer
CTTCTGCTCTAAAAGCTGCG and 3' end primer
CGACCTGCAGCTCGAGCACA."
BASE COUNT 230 a 219 c 172 g 187 t 8 others
ORIGIN



Query Match 59.8%; Score 309; DB 9; Length 816;
Best Local Similarity 80.2%; Pred. No. 2e-71;
Matches 384; Conservative 0; Mismatches 91; Indels 4; Gaps 2;



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I
Db 323 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCG 382

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I
Db 383 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAG 442

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 443 ACAGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGACTGGAGATGTAC 502

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
II II II II I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I II I I I I III
Db 503 TGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGAC 562

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I
Db 563 ATGCCCAAGACTCAGAAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGG 622

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 623 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGAAGTGCANGAAACAAGACCTA 682

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I
Db 683 CAGAATGTANGAGGAGCCTNCCACGGAGCAGAANATGCCACATCACCGCANGATCCTTTG 742

Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATT 475
II I I M I I I I I I I I I I I I I I I I III III
Db 743 CTGCTTGAGCAACCTGCANAACATCGAAACACCTACCAAATAACATNTATAAGTCCAAT 801



RESULT 11 

BF383724 

LOCUS BF383724 594 bp mRNA linear EST 27-NOV-2000
DEFINITION 602044632F1 NCI_CGAP_Li9 Mus musculus cDNA clone IMAGE:4194295 5',
mRNA sequence.
ACCESSION BF383724
VERSION BF383724.1 GI:11365029
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 594)
AUTHORS NIH-MGC http://mgc.nci.nih.gov/.
TITLE National Institutes of Health, Mammalian Gene Collection (MGC)
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Jeffrey E. Green, M.D.
cDNA Library Preparation: Life Technologies, Inc.
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Incyte Genomics, Inc.
Clone distribution: MGC clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov
Plate: LLAM9527 row: p column: 08
High quality sequence stop: 589.
FEATURES Location/Qualifiers
source 1..594
/organism="Mus musculus"
/mol_type="mRNA"
/strain="FVB/N"
/db_xref="taxon:10090"
/clone="IMAGE:4194295"
/lab_host="DH10B (T1 phage-resistant)"
/clone_lib="NCI_CGAP_Li9"
/note="Organ: liver; Vector: pCMV-SPORT6; Site_1: NotI;
Site_2: SalI; Cloned unidirectionally. Primer: Oligo dT.
Average insert size 1.9 kb. Constructed by Life
Technologies. Note: this is a NCI_CGAP Library."
BASE COUNT 175 a 162 c 142 g 115 t
ORIGIN



Query Match 58.7%; Score 303.6; DB 10; Length 594;
Best Local Similarity 80.7%; Pred. No. 4.9e-70;
Matches 394; Conservative 0; Mismatches 84; Indels 10; Gaps 3;



Qy 16 TGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGACAGGGGCTTTTATTTC 75
I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 107 TGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCGAGGGGCTTTTACTTC 166

Qy 76 AACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGACAGGCATCGTGGAT 135
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 167 AACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAGACAGGCATTGTGGAT 226

Qy 136 GAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAG 195
I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II II II II II III
Db 227 GAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGACTGGAGATGTACTGTGCCCCACTGAAG 286

Qy 196 CCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAG 255
III I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I III
Db 287 CCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGACATGCCCAAGACTCAG 346

Qy 256 AAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---GAGAAGGAAAGGAAGT 312
I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I II
Db 347 AAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGGAGAAGGAAAGGAAGT 406



Qy 313 ACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAGGATGTA-GAAGA 371

I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 
Db 407 ACATTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTACAGAATGTAGGAGGA 466

Qy 372 CCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCTGCACAGTTACC 431

I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I 
Db 467 GCCTCCCACGGAGCAGAAAATGCCACATCACCGCAGGATCCTTTGCTGCTTGAGCAACCT 526

Qy 432 TGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATCACATTTCAAAGATGG 485

I I I I I I I II I I I III MINIMI I I I I I I I I I I I I I I I I 

Db 527 GCAAAACATCGAAACACCTACCAAATAACAATAATAAGTCCAATAACATTACAAAGATGG 586

Qy 486 CATTTCCC 493 

I I I I 

Db 587 GCATTGCC 594 



RESULT 12 

AA913900/c 

LOCUS AA913900 527 bp mRNA linear EST 24-SEP-1998
DEFINITION ol35g05.s2 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone
IMAGE:1525496 3' similar to gb:X57025_rna1 INSULIN-LIKE GROWTH
FACTOR IA PRECURSOR (HUMAN);, mRNA sequence.
ACCESSION AA913900
VERSION AA913900.1 GI:3053292
KEYWORDS EST.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 527)
AUTHORS NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
TITLE National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
Tumor Gene Index
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
This clone is available royalty-free through LLNL; contact the
IMAGE Consortium (info@image.llnl.gov) for further information.
Insert Length: 870 Std Error: 0.00
Seq primer: -40m13 fwd. ET from Amersham
High quality sequence stop: 97.
FEATURES Location/Qualifiers
source 1..527
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="IMAGE:1525496"
/lab_host="DH10B"
/clone_lib="Soares_NFL_T_GBC_S1"
/note="Organ: pooled; Vector: pT7T3D-Pac (Pharmacia) with
a modified polylinker; Site_1: Not I; Site_2: Eco RI;
Equal amounts of plasmid DNA from three normalized
libraries (fetal lung NbHL19W, testis NHT, and B-cell
NCI_CGAP_GCB1) were mixed, and ss circles were made in
vitro. Following HAP purification, this DNA was used as
tracer in a subtractive hybridization reaction. The driver
was PCR-amplified cDNAs from pools of 5,000 clones made
from the same 3 libraries. The pools consisted of
I.M.A.G.E. clones 297480-302087, 682632-687239,
726408-728711, and 729096-731399. Subtraction by Bento
Soares and M. Fatima Bonaldo."
BASE COUNT 125 a 134 c 119 g 149 t
ORIGIN



Query Match 58.0%; Score 299.8; DB 9; Length 527;
Best Local Similarity 85.5%; Pred. No. 4.9e-69;
Matches 413; Conservative 0; Mismatches 17; Indels 53; Gaps 5;



Qy 39 TCTTCAGTTCGTGTGTGGAGACAGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTC 98
I I I I I I I I I II I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I
Db 527 TCTTCAGTTCGTGTGTGGAGACAGGGGCTTTATTTACAACAAGCCCACAGGGTATGGCTC 468

Qy 99 CAGCAGTCGGAGGGCGCCTCAGACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGA 158
I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 467 CAGCAGTCGGAGGGCGCCTAAGACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGA 408

Qy 159 TCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGT 218
I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 407 TCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGT 348

Qy 219 CCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAA 278
I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I
Db 347 CCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAG----------------------- 311

Qy 279 CAAGAACACGAAGTCTCAGAGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGA 338
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 310 --------------------------AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGA 277

Qy 339 GTGCAGGAAACAAGAACTACAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAG 397
I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I
Db 276 GTGCAGGAAACAAGAACTACAGGATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACAT 217

Qy 398 GCCACCGCAGGACCCTTTGCTCTGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAA 455
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I
Db 216 GCCACCGCAGGATCCTTTGCTCTGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAA 157

Qy 456 AAAATAAGTTTGATCACATTTCAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAA 514
I II I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 156 AAAATAAGTTTGATAACATTTAAAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAA 97

Qy 515 CAT 517
I I I
Db 96 CAT 94



RESULT 13 
AI876493/c 

LOCUS AI876493 642 bp mRNA linear EST 21-JUL-1999
DEFINITION uj59b10.x1 Sugano mouse liver mlia Mus musculus cDNA clone
IMAGE:1924219 3' similar to gb:X57025_rna1 INSULIN-LIKE GROWTH
FACTOR IA PRECURSOR (HUMAN); gb:X04482 Mouse mRNA for
preproinsulin-like growth factor IB (MOUSE);, mRNA sequence.
ACCESSION AI876493
VERSION AI876493.1 GI:5550542
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 642)
AUTHORS Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C., Wylie,T.,
Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers,Y., Person,B.,
Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk,R., Ritter,E.,
Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann,R.,
Waterston,R. and Wilson,R.
TITLE The WashU-NCI Mouse EST Project 1999
JOURNAL Unpublished
COMMENT Contact: Marra M/WashU-NCI Mouse EST Project 1999
Washington University School of Medicine
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA
Tel: 314 286 1800
Fax: 314 286 1810
Email: mouseest@watson.wustl.edu
This clone is available royalty-free through LLNL; contact the
IMAGE Consortium (info@image.llnl.gov) for further information.
MGI:980511
Seq primer: custom primer used
High quality sequence stop: 257.
FEATURES Location/Qualifiers
source 1..642
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="IMAGE:1924219"
/sex="female"
/dev_stage="adult"
/lab_host="DH10B"
/clone_lib="Sugano mouse liver mlia"
/note="Organ: liver; Vector: pME18S-FL3; Site_1: DraIII
(CACTGTGTG); Site_2: DraIII (CACCATGTG); 1st strand cDNA
was primed with an oligo(dT) primer
[ATGTGGCCTTTTTTTTTTTTTTTTT]; double-stranded cDNA was
ligated to a DraIII adaptor [TGTTGGCCTACTGG], digested
and cloned into distinct DraIII sites of the pME18S-FL3
vector (5' site CACTGTGTG, 3' site CACCATGTG). XhoI should
be used to isolate the cDNA insert. Size selection was
performed to exclude fragments <1.5kb. Library
constructed by Dr. Sumio Sugano (University of Tokyo
Institute of Medical Science). Custom primers for
sequencing: 5' end primer CTTCTGCTCTAAAAGCTGCG and 3' end
primer CGACCTGCAGCTCGAGCACA."
BASE COUNT 127 a 154 c 175 g 185 t 1 others
ORIGIN



Query Match 56.0%; Score 289.6; DB 9; Length 642; 

Best Local Similarity 78.9%; Pred. No. 2.7e-66; 

Matches 397; Conservative 0; Mismatches 95; Indels 11; Gaps 4; 



Qy 2 GACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGACA 61 

I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I II I I I 
Db 503 GACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGATGCTCTTCAGGTCGTGTGTGGACCGA 444 

Qy 62 GGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGA 121 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I 

Db 443 GGGGCTTTTTCTTCAACAAGGCCACAGGCTATGGCTCCAGCATTTGGAGGGCACCTCAGA 384 

Qy 122 CAGGCATCGTGGATGAGTGCTGCTTCCGG-AGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Ml M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II Ml I I I I I I I 
Db 383 CAGTCAATGTGGATGAGTGTTGCTTCCGGAAGCTGTGATCTGAGAAGACTGNAGATGTAC 324

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

II II M M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml 

Db 323 TGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGCGCCACACTGAC 264 

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297

I I I I M I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I II | 
Db 263 ATGCCCAAGACTCAGAAGTCCCCGTCCCTATCGACAAACAAGAAAACGAAGCTGCAAAGG 204

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml 
Db 203 AGAAGGAAAGGAAGT AC AT T T GAAGAAC ACAAGT AGAGGAAGT GCAGGAAACAAGACCT A 144 

Qy 358 C AGGAT GT A- GAAGAC CCT T CT GAGGAGT GAAGAAGGACAGGC CACCGCAGGAC C CTT T G 416 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 143 CAGAATGTAGGAGGAGCCTCCCACGGAGCAGAAAAT GCCACAT CACCGCAGGAT CCTTTG 84 

Qy 417 CTCT GCACAGTT ACCT GTAAAC ATT GGAAT AC C GGCCA AAAAAT AAGT T T GAT C 470 

II I I I I I I I I I I I I I III I I I I II I I I II 

Db 8 3 CT GCT T GAGCAACCT GCAAAACAT CGAAAC AC CT AC CAAATAACAATAATAAGT C CAAT A 24 

Qy 471 ACAT T T CAAAGAT GGC AT T T C CC 493 

I I I I I I I I I I I II I II II 
Db 23 ACATTACAAAGATGGGCATTTCC 1 



RESULT 14
AW495481/c
LOCUS       AW495481 499 bp mRNA linear EST 24-FEB-2000
DEFINITION  UI-M-BH3-auy-g-11-0-UI.s1 NIH_BMAP_M_S4 Mus musculus cDNA clone
            UI-M-BH3-auy-g-11-0-UI 3', mRNA sequence.
ACCESSION   AW495481
VERSION     AW495481.1 GI:7065762
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 499)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
  PUBMED    8889548
COMMENT     Contact: Chin, H
            National Institute of Mental Health
            6001 Executive Blvd. Room 7N-7190, MSC 9643, Bethesda, MD
            20892-9643, USA
            Tel: 301 443 1706
            Fax: 301 443 9890
            Email: mEST@mail.nih.gov
The sequence contained an oligo-dT track that was present in the 
oligonucleotide that was used to prime the synthesis of first 
strand cDNA and therefore this may represent a bonafide poly A 
tail. The sequence tag present in the cDNA between the NotI site 
and the oligo-dT track served to identify it as a clone from the 
normalized pineal glands library cDNA Library Preparation: M.B. 
Soares Lab Clone distribution: Researchers may obtain BMAP cDNA 
clones from RESEARCH GENETICS. It should be noted that Bento Soares 
is generating a small number of additional specialized 
non-redundant arrays of BMAP cDNAs whose availability will be 
considered under appropriate and limited collaborative arrangements 
Seq primer: M13 Forward 
POLYA-Yes. 

FEATURES             Location/Qualifiers
     source          1..499
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="taxon:10090"
                     /clone="UI-M-BH3-auy-g-11-0-UI"
                     /dev_stage="27-32 days"
                     /lab_host="DH10B (Life Technologies)"
                     /clone_lib="NIH_BMAP_M_S4"

/note="Vector: pT7T3D-Pac (Pharmacia) with a modified
polylinker; Site_1: Not I; Site_2: Eco RI; The
NIH_BMAP_M_S4 library is a subtracted library of a series,
ultimately derived from a mixture of individually tagged
normalized libraries from ten regions of the mouse brain
(cerebellum, brain stems, olfactory bulbs, hypothalamus,
cortex, amygdala, basal ganglia, pineal gland, striatum,
hippocampus) after a series of subtractions to reduce the
representation of cDNAs from which ESTs had already been
generated. The following serially subtracted libraries
were generated in this process: NIH_BMAP_M_S4,
NIH_BMAP_M_S3.3, NIH_BMAP_M_S3.2, NIH_BMAP_M_S3.1,
NIH_BMAP_M_S2, NIH_BMAP_M_S1. The subtracted library
(NIH_BMAP_M_S4) was constructed as follows: PCR amplified
cDNA inserts from NIH_BMAP_M_S3.3, NIH_BMAP_M_S3.2, and
NIH_BMAP_M_S3.1 clones from which 3' ESTs had been derived
was used as a driver in a hybridization with a pool of
the NIH_BMAP_M_S3.3, NIH_BMAP_M_S3.2, and NIH_BMAP_M_S3.1
libraries in the form of single-stranded circles. The
remaining single-stranded circles (subtracted library)
was purified by hydroxyapatite column chromatography,
converted to double-stranded circles and electroporated
into DH10B bacteria (Life Technologies) to generate the
NIH_BMAP_M_S4 library. This procedure has been previously
described (Bonaldo, Lennon and Soares, Genome Research
6:791-806, 1996)
TAG_LIB=NIH_BMAP_M_S4
TAG_TISSUE=pineal-glands
TAG_SEQ=CAGAC"
BASE COUNT 86 a 112 c 124 g 177 t

ORIGIN

Query Match 55.6%; Score 287.4; DB 9; Length 499;

Best Local Similarity 80.8%; Pred. No. 9.7e-66; 

Matches 387; Conservative 0; Mismatches 81; Indels 11; Gaps 4; 
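
The Score reported for this hit can be rebuilt from the counts on the line above. The sketch below rests on assumptions inferred from the numbers in this report rather than from any GenCore documentation: each identical base scores +1 (consistent with a perfect score of 517 for a 517-base query), each mismatch costs 0.6, and every gap run costs Gapop (10.0) plus Gapext (1.0) per gapped position. The function name and the 0.6 mismatch penalty are illustrative assumptions.

```python
def score_from_counts(matches, mismatches, indels, gaps,
                      match=1.0, mismatch=0.6, gap_open=10.0, gap_ext=1.0):
    """Rebuild an alignment score from the summary counts.

    Assumed scheme (inferred, not documented in the report): +match per
    identical base, -mismatch per mismatched base, and each gap run costs
    gap_open once plus gap_ext for every gapped position (indel).
    """
    return (matches * match
            - mismatches * mismatch
            - gaps * gap_open
            - indels * gap_ext)


# Counts copied from the "Matches ...; Indels ...; Gaps ..." line of this hit.
print(round(score_from_counts(387, 81, 11, 4), 1))  # 287.4
```

Under the same assumptions the counts of RESULT 15 (375, 80, 11, 4) give 276.0, which also agrees with the score printed for that hit.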

Qy 50 TGTGTGGAGACAGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGA 109 

I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I II I I II I I I I I I I I I I 
Db 499 TGTGTGGACCGAGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGA 440 

Qy 110 GGGCGCCTCAGACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGC 169 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I M I I I I I I I I I I 
Db 439 GGGCAC CT CAGAC AGGCAT T GT GGAT GAGT GTT GCT T CC GGAGCT GT GAT CT GAGGAGAC 380 

Qy 170 TGGAGATGTATTGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGC 229

I I I I I I I I I I II II II II I I I I I I I II | | | | | | | | | | I I I I I I I I I I I I 
Db 379 TGGAGATGTACTGTGCCCCACTGAAGCCTACAAAAGCAGCCCGCTCTATCCGTGCCCAGC 320

Qy 230 GC CAC ACC GACAT GC C CAAGAC C CAGAAGT ATC AGC C C CCAT CT AC CAACAAGAAC AC GA 2 89 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml Ml || I I I I I I I I I I I I 

Db 319 G C CAC ACT GACAT GC C CAAGACT CAGAAGT C CC C GT C C CTAT C GACAAACAAGAAAAC GA 260 

Qy 290 AGTCTCA GAGAAG GAAAGGAAGTAC AT T T GAAGAAC ACAAGT AGAGGGAGT GC AGGA 346 

II M I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 259 AGCT GCAAAG GAGAAGGAAAGGAAGT AC AT TTGAAGAAC ACAAGT AGAGGAAGT GC AGGA 200 

Qy 347 AACAAGAACTACAGGAT GT A- GAAGAC CCT T CT GAGGAGT GAAGAAGGAC AGGC CAC C GC 405 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 199 AACAAGAC CT ACAGAAT GT AGGAGGAGC CT CC C AC G GAGCAGAAAAT GC CAC AT CAC C GC 140 

Qy 406 AGGACCCTTTGCTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA AAAAA 4 59 

I I I I I I I I I I I I I I I I II I I I I I I I III I I I I 

Db 139 AGGAT CCTTTGCTGCTT GAGCAAC CT GCAAAACAT C GAAAC AC CT AC CAAATAACAATAA 8 0 

Qy 460 T AAGTTT GAT CAC ATTT CAAAGAT - GGC AT TT C C C C CAAT GAAAT AC ACAAGT AAACAT 517 

I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 7 9 TAAGT C CAATAACAT T ACAAAGAT GGGC AT TT C C C C CAAT GAAAT AT ACAAGTAAACAT 21 



RESULT 15
AI169770/c
LOCUS       AI169770 468 bp mRNA linear EST 20-JAN-1999
DEFINITION  EST215669 Normalized rat liver, Bento Soares Rattus sp. cDNA clone
            RLIAT07 3' end, mRNA sequence.
ACCESSION   AI169770
VERSION     AI169770.1 GI:3709810
KEYWORDS    EST.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (bases 1 to 468)
  AUTHORS   Lee,N.H., Glodek,A., Chandra,I., Mason,T.M., Quackenbush,J.,
            Kerlavage,A.R. and Adams,M.D.
  TITLE     Rat Genome Project: Generation of a Rat EST (REST) Catalog & Rat
            Gene Index
  JOURNAL   Unpublished
COMMENT     Other_ESTs: TC50779
            Contact: Lee, NH
            The Institute for Genomic Research
            9712, Medical Center Drive, Rockville, MD 20850, USA
            Tel: (301)-838-3529
            Fax: (301)-838-0208
            Email: nhlee@tigr.org
            Seq primer: M13-21.
FEATURES             Location/Qualifiers
     source          1..468
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="ATCC (inhost):2027570"
                     /db_xref="taxon:10118"
                     /clone="RLIAT07"
                     /clone_lib="Normalized rat liver, Bento Soares"
                     /note="Organ: liver; Vector: pT7T3Pac; Site_1: EcoRI;
                     Site_2: NotI"
BASE COUNT  85 a 115 c 119 g 149 t
ORIGIN

Query Match 53.4%; Score 276; DB 9; Length 468;
Best Local Similarity 80.5%; Pred. No. 1e-62;
Matches 375; Conservative 0; Mismatches 80; Indels 11; Gaps 4;



Qy 63 GGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGAC 122
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I
Db 468 GGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCACAGAC 409

Qy 123 AGGCATCGTGGATGAGT GCT GCTTCCGGAGCT GT GAT CTAAGGAGGCT GGAGAT GTATTG 182
Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II
Db 408 GGGCAT T GT GGAT GAGT GTT GCT CCCGGAGCT GT GAT CT GAGGAGGT T GGAGAT GT ACT G 349

Qy 183 CGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACAT 242
II M M I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I
Db 348 TGCTCCGCTGAAGCCTACAAAGTCAGCTCGTTCCATCCGGGCCCAGCGCCACACTGACAT 289

Qy 243 GCCCAAGACCCAGAAGTAT CAGCCCCCATCTACCAACAAGAACACGAAGT CT CA GAG 299
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I || Ml
Db 288 GCCCAAGACTCAGAAGT CCCAGCCCCT AT CGACACACAAGAAAAGGAAGCT GCAAAGGAG 229

Qy 300 AAGGAAAGGAAGT AC AT T T GAAGAAC ACAAGT AGAGGGAGT GCAGGAAACAAGAACTACA 359
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 228 AAGGAAAGGAAGT ACACTT GAAGAACACAAGTAGAGGAAGT GCAGGAAACAAGAC CTACA 169

Qy 360 GGAT GT A- GAAGAC C CT T CT GAGGAGT GAAGAAGGACAG GC C AC C GC AGGAC C CT T T GCT 418
I I I I I I M M Ml I I M I I I II I II I I I I I I I I II I I I I I I I I
Db 168 GAAT GT AGGAGGAGC CT C C CGAGGAACAGAAAAT GC C AC GT C AC CGCAAGAT C CT TT GCT 109

Qy 419 CT GC ACAGT T AC CT GTAAACATT GGAAT AC C GGC C A AAAAAT AAGT T T GAT C AC 472
I I I I I I I I I I I I I I I I I I I I I I I I I I I II I
Db 108 GCTT GAGCAACCT GCAAAACATCGGAACACCT GCCAAATAT CAATAAT GAGTT CAATACC 49

Qy 473 ATTTCAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 48 ATTT CAGAGAT GGGCATTT CCCTCAAT GAAATACACAAGTAAACAT 3



Search completed: December 13, 2003, 07:29:47 
Job time : 1690.26 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:         December 13, 2003, 05:41:20 ; Search time 2309.97 Seconds
                (without alignments)
                9156.102 Million cell updates/sec

Title:          US-09-852-261-1
Perfect score:  517
Sequence:       1 ggaccggagacgctctgcgg tgaaatacacaagtaaacat 517

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       2888711 seqs, 20454813386 residues
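
The throughput figure in this header can be checked against the other numbers it lists. The sketch below is a rough consistency check, not GenCore's own accounting: it assumes one cell update per query base per database residue, and that the query was compared against both strands of the database (the factor of 2 is an assumption that happens to bring the estimate close to the reported value). The function name is illustrative.

```python
def cell_update_rate(query_len, db_residues, seconds, strands=2):
    """Estimated dynamic-programming cell updates per second, in millions.

    strands=2 is an assumption (forward and reverse-complement search);
    it is not stated anywhere in the report header.
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Query length, residue count and search time copied from this header.
rate = cell_update_rate(517, 20_454_813_386, 2309.97)
print(f"{rate:.1f} million cell updates/sec")
```

The estimate lands within a fraction of a percent of the 9156.102 million cell updates/sec printed above; the small difference is plausibly due to rounding of the reported search time.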



Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
Database :



GenEmbl:*
gb_ba:*
gb_htg:*
gb_in:*
gb_om:*
gb_ov:*
gb_pat:*
gb_ph:*
gb_pl:*
gb_pr:*
gb_ro:*
gb_sts:*
gb_sy:*
gb_un:*
gb_vi:*
em_ba:*
em_fun:*
em_hum:*
em_in:*
em_mu:*
em_om:*
em_or:*
em_ov:*
em_pat:*
em_ph:*
em_pl:*
em_ro:*
em_sts:*
htgo hum:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

RESULT 1
AX147742
LOCUS       AX147742 517 bp DNA linear PAT 31-AUG-2001
DEFINITION  Sequence 1 from Patent WO0136483.
ACCESSION   AX147742
VERSION     AX147742.1 GI:14346787
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Goldspink,G.R. and Johnson,I.R.
  TITLE     Use of the insulin-like-growth factor i isoform mgf for the
            treatment of neurological disorders
  JOURNAL   Patent: WO 0136483-A 1 25-MAY-2001;
            University College London (GB)
FEATURES             Location/Qualifiers
     source          1..517
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     CDS             <1..333
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAC41175.1"
                     /db_xref="GI:14346788"
                     /db_xref="REMTREMBL:CAC41175"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKYQPPSTNKNTKSQRRK
                     GSTFEEHK"
BASE COUNT  150 a 130 c 139 g 98 t
ORIGIN

Query Match 100.0%; Score 517; DB 6; Length 517;
Best Local Similarity 100.0%; Pred. No. 4.4e-155;
Matches 517; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
I I I I I I I I I I I I I I I I I I M I I 1 I I I I I I I I I I I I I I I I I I I I II I I I ! I I I I I I I I I I I
Db 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60



Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

Qy 121 ACAGGCATCGTGGAT GAGT GCT GCTT C CGGAGCT GT GAT CTAAGGAGGCT GGAGATGTAT 18 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 ACAGGCATCGTGGAT GAGT GCT GCTTCC GGAGCT GTGAT CTAAGGAGGCT GGAGAT GTAT 180 

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 AT GC C CAAGAC CCAGAAGTAT CAGC C CC CAT CT AC CAACAAGAACAC GAAGT CTC AGAGA 300 

I I I I I I I I I I I I I I I I I I I II II I II II I I I I I I I M II I I I I I I II I I I I I I I I I I I I I 

Db 241 AT GC C CAAGAC CCAGAAGTAT C AGC C C C CAT CT AC CAACAAGAACAC GAAGT CT CAGAGA 300 

Qy 301 AG GAAAGGAAGTACAT T T GAAGAACACAAGT AGAGGGAGT GC AGGAAACAAGAACT ACAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 AGGAAAGGAAGT ACAT T T GAAGAACACAAGT AGAGGGAGT GCAGGAAACAAGAACT ACAG 360 

Qy 361 GAT GT AGAAGAC C CT T CT GAGGAGT GAAGAAGGAC AGGC C AC CGCAGGAC CCT T T GCT CT 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 GAT GT AGAAGACC CT T CT GAGGAGT GAAGAAGGAC AGGC C AC CG CAGGAC CCTT T GCT CT 42 0 

Qy 421 GCACAGTTAC CTGTAAACATT GGAATACCGGCCAAAAAATAAGTTT GATCACATTTCAAA 480 

I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 421 GC ACAGTT AC CT GTAAAC ATT GGAAT ACCGGCCAAAAAATAAGT TT GAT CACATT T CAAA 480 

Qy 4 81 GATGGCATTT CCCCCAAT GAAATACACAAGTAAACAT 517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GAT GGCATTT CCCCCAAT GAAATACACAAGTAAACAT 517 



RESULT 2 
AX300779
LOCUS       AX300779 517 bp DNA linear PAT 30-NOV-2001
DEFINITION  Sequence 1 from Patent WO0185781.
ACCESSION   AX300779
VERSION     AX300779.1 GI:17382060
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Goldspink,G.D. and Terenghi,G.B.
  TITLE     Repair of nerve damage
  JOURNAL   Patent: WO 0185781-A 1 15-NOV-2001;
            University College London (GB); East Grinstead Medical Research
            Trust (GB)
FEATURES             Location/Qualifiers
     source          1..517
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     CDS             <1..333
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAD13040.1"
                     /db_xref="GI:17382061"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKYQPPSTNKNTKSQRRK
                     GSTFEEHK"
BASE COUNT  150 a 130 c 139 g 98 t
ORIGIN

Query Match 100.0%; Score 517; DB 6; Length 517;

Best Local Similarity 100.0%; Pred. No. 4.4e-155; 

Matches 517; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy 121 ACAGGC AT CGT G GAT GAGT G CT GCT T C C GGAG CT GT GAT CTAAGGAGGCT GGAGAT GT AT 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180 

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I H I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2 41 AT G CC CAAGAC C C AGAAGT AT CAGC C C CC ATCT AC CAACAAGAAC AC GAAGT CT CAGAGA 300 

Qy 3 01 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 AGGAAAGGAAGT ACATT T GAAGAAC ACAAGTAGAGGGAGT GCAGGAAACAAGAACT AC AG 360 

Qy 361 GAT GT AGAAGAC C CTTCT GAGGAGT GAAGAAG GAC AGGC C AC C GCAGGAC C CT T T GCT CT 42 0 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 GAT GT AGAAGAC C CTT CT GAGGAGT GAAGAAGGACAGGC CAC C GCAGGAC CCT T T GCT CT 420 

Qy 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGAT CACATTT CAAA 480 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480

Qy 481 GAT GGCATTT CCCCCAAT GAAAT ACACAAGTAAACAT 517 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 GAT GGCATTT CCCCCAAT GAAAT ACACAAGTAAACAT 517 



RESULT 3 
AX147746 

LOCUS       AX147746 523 bp DNA linear PAT 31-AUG-2001
DEFINITION  Sequence 5 from Patent WO0136483.
ACCESSION   AX147746
VERSION     AX147746.1 GI:14346791
KEYWORDS
SOURCE      Oryctolagus cuniculus (rabbit)
  ORGANISM  Oryctolagus cuniculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
REFERENCE   1
  AUTHORS   Goldspink,G.R. and Johnson,I.R.
  TITLE     Use of the insulin-like-growth factor i isoform mgf for the
            treatment of neurological disorders
  JOURNAL   Patent: WO 0136483-A 5 25-MAY-2001;
            University College London (GB)
FEATURES             Location/Qualifiers
     source          1..523
                     /organism="Oryctolagus cuniculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9986"
     CDS             <1..336
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAC41177.1"
                     /db_xref="GI:14346792"
                     /db_xref="REMTREMBL:CAC41177"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKAARSVRAQRHTDMPKTQKYQPPSTNKKMKSQRRR
                     KGSTFEEHK"
BASE COUNT  154 a 129 c 142 g 98 t
ORIGIN

Query Match 90.4%; Score 467.4; DB 6; Length 523;
Best Local Similarity 96.2%; Pred. No. 4.4e-139;
Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2;



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I
Db 1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60



Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
I I I I I I I I I I I I I I II M I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II
Db 61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy 121 AC AG GCAT C GT GGAT GAGT GCT GCT T C C GGAGCT GT GATCTAAGGAGGCT GGAGAT GT AT 180
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 121 AC AGGCAT C GT GGAT GAGT GCTGCTTC C GGAGCT GT GATCT GAGGAGGCT GGAGAT GT AC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
II I I I I I I I I I II I I I II III I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 AT GC C CAAGAC C CAGAAGT AT C AGCC C C CAT CT AC CAACAAGAAC AC GAAGT CT C A G 297
I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I
Db 241 AT GC C CAAGACT CAGAAGT AT CAGC CT C CAT CT AC CAACAAGAAAAT GAAGT CT C AGAG G 300



Qy 298 ... 357
Db 301 ... 360


Qy 358 CAGGATGTA- GAAGACCCTT CT GAGGAGT GAAGAAGGACAGGCCACCGCAGGACCCTTT G 416 

I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 CAGGAT GTAGGAAGACCCTT CT GAGGAGT GAAGAAGGACAGGC C ACC GC AGGACC CTT T G 420 

Qy 417 CT CT GCACAGTT AC CT GTAAAC ATT GGAAT AC C GG CCAAAAAAT AAGT T T GAT C AC AT TT 476 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 CT CT GCACAGT T AC C T GTAAAC AT T GGAAT AC C GGCCAAAAAATAAGT TT GAT CAC ATT T 480 

Qy 477 CAAAGATGGCATTT C CCCCAAT GAAATACACAAGTAAACAT 517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 481 CAAAGAT GGC AT TT C CC C CAAT GAAATACACAAGTAAACAT 521 



RESULT 4 
AX300783
LOCUS       AX300783 523 bp DNA linear PAT 30-NOV-2001
DEFINITION  Sequence 5 from Patent WO0185781.
ACCESSION   AX300783
VERSION     AX300783.1 GI:17382064
KEYWORDS
SOURCE      Oryctolagus cuniculus (rabbit)
  ORGANISM  Oryctolagus cuniculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
REFERENCE   1
  AUTHORS   Goldspink,G.D. and Terenghi,G.B.
  TITLE     Repair of nerve damage
  JOURNAL   Patent: WO 0185781-A 5 15-NOV-2001;
            University College London (GB); East Grinstead Medical Research
            Trust (GB)
FEATURES             Location/Qualifiers
     source          1..523
                     /organism="Oryctolagus cuniculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9986"
     CDS             <1..336
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAD13042.1"
                     /db_xref="GI:17382065"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKAARSVRAQRHTDMPKTQKYQPPSTNKKMKSQRRR
                     KGSTFEEHK"
BASE COUNT  154 a 129 c 142 g 98 t
ORIGIN

Query Match 90.4%; Score 467.4; DB 6; Length 523;
Best Local Similarity 96.2%; Pred. No. 4.4e-139;
Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2;



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I
Db 61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120



Qy 121 ACAGGCATCGT GGAT GAGTGCT GCTTCCGGAGCT GTGAT CTAAGGAGGCT GGAGATGTAT 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 AC AG G CAT C GT GGAT GAGT GCT GCTT C C GGAGCT GTGAT CT GAG GAGGCT G GAGAT GT AC 18 0 

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240 

II I I I I I I I I I I I I I I II III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i 

Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 2 40 

Qy 241 AT GC C CAAGAC CCAGAAGT AT CAGCC C C CAT CT AC CAACAAGAAC AC GAAGT CT CA G 297 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 AT GC C CAAGACT CAGAAGT AT CAGCCT C CAT CT AC CAACAAGAAAAT GAAGT CT C AGAGG 300 

Qy 298 AGAAGGAAAGGAAGTACATTT GAAGAACACAAGT AGAGGGAGT GCAGGAAACAAGAACTA 357 

I I I I I M I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 301 AGAAGGAAAGGAAGTACATTT GAAGAACACAAGT AGAGGGAGT GCAGGAAACAAGAACTA 360 

Qy 358 CAGGAT GTA- GAAGACCCTT CT GAGGAGT GAAGAAGGACAGGCCACCGCAGGACCCTTT G 416 

I I I I I I I I I II I I M I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 361 CAGGAT GTAGGAAGACCCTT CT GAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 420 

Qy 417 CTCT GCACAGTTAC CT GTAAACATT GGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 CTCT GCACAGTTACCT GTAAACATT GGAATACCGGCCAAAAAATAAGTTTGATCACATTT 4 80 

Qy 477 CAAAGAT GGCATTT CC C C CAAT GAAAT AC ACAAGT AAAC AT 517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 481 CAAAGAT GGCATTT CC C C CAAT GAAAT AC ACAAGTAAACAT 521 



RESULT 5 
AX147754
LOCUS       AX147754 471 bp DNA linear PAT 08-JUN-2001
DEFINITION  Sequence 13 from Patent WO0136483.
ACCESSION   AX147754
VERSION     AX147754.1 GI:14348552
KEYWORDS
SOURCE      Oryctolagus cuniculus (rabbit)
  ORGANISM  Oryctolagus cuniculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
REFERENCE   1
  AUTHORS   Goldspink,G.R. and Johnson,I.R.
  TITLE     Use of the insulin-like-growth factor i isoform mgf for the
            treatment of neurological disorders
  JOURNAL   Patent: WO 0136483-A 13 25-MAY-2001;
            University College London (GB)
FEATURES             Location/Qualifiers
     source          1..471
                     /organism="Oryctolagus cuniculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9986"
     CDS             <1..318
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAC41264.1"
                     /db_xref="GI:14348553"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKAARSVRAQRHTDMPKTQKEVHLKNTSRGSAGNKN
                     YRM"
BASE COUNT  132 a 118 c 131 g 90 t
ORIGIN

Query Match 73.0%; Score 377.2; DB 6; Length 471; 

Best Local Similarity 87.8%; Pred. No. 5.4e-110; 

Matches 455; Conservative 0; Mismatches 13; Indels 50; Gaps 2; 

Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 AGGG GCT T T TAT T T CAACAAGC C CACAGGAT AC GGCT C C AGC AGT CGGAGGGC AC CT C AG 120 

Qy 121 ACAGGCAT CGTGGAT GAGT GCT GCTTCCGGAGCT GT GAT CTAAGGAGGCT GGAGATGTAT 180 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I II I I I I 

Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180 

Qy 181 T GC GCACC C CT CAAG C CT GC CAAGT CAGCT C G CT CT GT C C GT GC C C AGC GC CACAC C GAC 240 

II I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240 

Qy 241 AT GC C CAAGACC CAGAAGT AT C AGC C CCCAT CT ACCAACAAGAAC AC GAAGT CT CAGAGA 300 

I I I I I I I I I I I II I 

Db 241 ATGCCCAAGACTCAG : 255 

Qy 301 AGGAAAGGAAGT ACAT T T GAAGAACACAAGT AGAGGGAGT GC AGGAAACAAGAACT AC AG 360 

I I I II I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 256 AAGGAAGT ACAT T T GAAGAACACAAGT AGAGGGAGT GC AGGAAACAAGAACT AC AG 311 

Qy 361 GAT GT A- GAAGACC CT T CT GAGGAGT GAAGAAGGAC AGGC CACCGCAGGACCCTT T GCT C 419 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 312 GAT GTAGGAAGACCCTT CTGAGGAGT GAAGAAGGACAGGCCACCGCAGGACCCTTT GCTC 371 

Qy 420 T GC AC AGT T ACCT GT AAACAT T GGAAT AC CGGC CAAAAAATAAGTTT GAT CACAT T T CAA 479 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 372 T GCACAGTTACCTGTAAACATT GGAAT ACCGGCCAAAAAATAAGTTT GAT CACATTTCAA 431 

Qy 480 AGAT GGCATTT CCCC CAAT GAAAT ACACAAGTAAACAT 517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 432 AGAT GGC AT TT C CC C CAAT GAAAT ACACAAGTAAACAT 469 



RESULT 6 
AX300791
LOCUS       AX300791 471 bp DNA linear PAT 30-NOV-2001
DEFINITION  Sequence 13 from Patent WO0185781.
ACCESSION   AX300791
VERSION     AX300791.1 GI:17382072
KEYWORDS
SOURCE      Oryctolagus cuniculus (rabbit)
  ORGANISM  Oryctolagus cuniculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
REFERENCE   1
  AUTHORS   Goldspink,G.D. and Terenghi,G.B.
  TITLE     Repair of nerve damage
  JOURNAL   Patent: WO 0185781-A 13 15-NOV-2001;
            University College London (GB); East Grinstead Medical Research
            Trust (GB)
FEATURES             Location/Qualifiers
     source          1..471
                     /organism="Oryctolagus cuniculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9986"
     CDS             <1..318
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAD13045.1"
                     /db_xref="GI:17382073"
                     /translation="GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIV
                     DECCFRSCDLRRLEMYCAPLKPAKAARSVRAQRHTDMPKTQKEVHLKNTSRGSAGNKN
                     YRM"
BASE COUNT  132 a 118 c 131 g 90 t
ORIGIN

Query Match 73.0%; Score 377.2; DB 6; Length 471;
Best Local Similarity 87.8%; Pred. No. 5.4e-110;
Matches 455; Conservative 0; Mismatches 13; Indels 50; Gaps 2;



Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
II I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 61 AGGGGCT T TT ATT T CAACAAGC C C ACAGGAT AC GGCTC CAGC AGT C GGAGG GCAC CT CAG 120

Qy 121 AC AGGC AT C GT GGAT GAGT GCT GCT T C C GGAGCT GTGAT CTAAGGAGGCT GGAGAT GT AT 180
I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 121 AC AGGC AT C GT GGAT GAGT GCT GCT T C C GGAGCT GT GAT CT GAGGAGGCT GGAGAT GTAC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
II I I I I I I I I I I I I I I II III I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 AT GCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCT CAGAGA 300
I I I I I I I I I I I I I I
Db 241 ATGCCCAAGACT CAG 255

Qy 301 AGGAAAGGAAGT AC AT TT GAAGAAC ACAAGT AGAGGGAGT GCAGGAAACAAGAACT ACAG 360
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 256 AAGGAAGT ACAT TT GAAGAAC ACAAGT AGAGGGAGT GCAGGAAACAAGAACT ACAG 311

Qy 361 GAT GT A- GAAGAC C CTT CT GAGGAGT GAAGAAGGACAGGC C AC C GC AG GAC CCT T T GCT C 419
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 312 GAT GT AG GAAGAC C CT T CT GAGGAGT GAAGAAGGACAGG C C AC C GC AGGACCCT T T GCT C 371

Qy 420 T GCAC AGT T AC CT GTAAACAT T GGAAT ACC GGC CAAAAAATAAGT T T GATCACAT T T CAA 479
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 372 T GCACAGT T AC CT GT AAAC AT T GGAAT AC C GGCCAAAAAATAAGTT T GAT CACAT T T CAA 431



Qy 480 AG AT G G CAT T T C C C C C AAT G AAAT AC AC AAGT AAAC AT 517 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 432 AGAT GGC ATT T C C CC CAAT GAAAT ACACAAGT AAAC AT 469 



RESULT 7 
HSU40870 

LOCUS HSU40870 444 bp mRNA linear PRI 05-APR-1996 

DEFINITION Human alternatively spliced human insulin-like growth factor-I 

(IGF-I) mRNA, partial cds.
ACCESSION U40870 
VERSION U40870.1 GI:1100902 

KEYWORDS 

SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 444) 

AUTHORS Chew,S.L., Lavender,P., Clark,A.J. and Ross,R.J.

TITLE An alternatively spliced human insulin-like growth factor-I 

transcript with hepatic tissue expression that diverts away from 
the mitogenic IBE1 peptide 
JOURNAL Endocrinology 136 (5), 1939-1944 (1995) 
MEDLINE 95237119 
PUBMED 7720641 
REFERENCE 2 (bases 1 to 444)
AUTHORS Chew,S.L.
TITLE Direct Submission
JOURNAL Submitted (20-NOV-1995) Shern L. Chew, Endocrinology, St
Bartholomew's Hospital Medical College, West Smithfield, London,
EC1A 7BE, UK
FEATURES Location/Qualifiers
source 1..444
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="pC4"
/tissue_type="liver"
gene 1..444
/gene="IGF-I"
CDS <1..420
/gene="IGF-I"
/note="alternatively spliced; previously, exon 5 and 6
were thought to be mutually exclusive; this transcript
splices from exon 5 into exon 6; the alternatively spliced
transcript would continue with exon 5 to the polyA signal"
/codon_start=1
/product="insulin-like growth factor-I"
/protein_id="AAA96152.1"
/db_xref="GI:1100903"
/translation="LKVKMHTMSSSHLFYLALCLLTFTSSATAGPETLCGAELVDALQ
FVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSARSV
RAQRHTDMPKTQKYQPPSTNKNTKSQRRKGSTFEERK"
exon 1..6
/gene="IGF-I"
/number=1
exon 7..163
/gene="IGF-I"
/number=3
exon 164..345
/gene="IGF-I"
/number=4
exon 346..394
/gene="IGF-I"
/number=5
exon 395..420
/gene="IGF-I"
/number=6
BASE COUNT 107 a 125 c 125 g 87 t
ORIGIN



Query Match 68.7%; Score 355.4; DB 9; Length 444;
Best Local Similarity 99.7%; Pred. No. 5.7e-103;
Matches 356; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         88 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 147

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        148 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 207

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        208 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 267

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        268 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 327

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        328 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 387

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
              ||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        388 AGGAAAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTA 444



RESULT 8 

HSIGF1A 

LOCUS HSIGF1A 616 bp mRNA linear PRI 29-NOV-1993
DEFINITION H. sapiens mRNA for IGF-1a.
ACCESSION X56773 S61841
VERSION X56773.1 GI:32989
KEYWORDS IGF-1 gene.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 616)
AUTHORS Sandberg-Nordqvist,A.C., Stahlbom,P.A., Lake,M. and Sara,V.R.
TITLE Characterization of two cDNAs encoding insulin-like growth factor 1
(IGF-1) in the human fetal brain
JOURNAL Brain Res. Mol. Brain Res. 12 (1-3), 275-277 (1992)
MEDLINE 92186627
PUBMED 1372070
REFERENCE 2 (bases 1 to 616)
AUTHORS Sandberg Nordqvist,A.C.
TITLE Direct Submission
JOURNAL Submitted (19-NOV-1990) A. C. Sandberg Nordqvist, KAROLINSKA INST 1 S
DEPT OF PATHOLOGY, KAROLINSKA HOSPITAL, BOX 605 00, S-104 01
STOCKHOLM, SWEDEN
REFERENCE 3 (bases 1 to 616)
AUTHORS Sandberg-Nordqvist,A.C., Stahlbom,P.A., Reinecke,M., Collins,V.P.,
von Holst,H. and Sara,V.
TITLE Characterization of insulin-like growth factor 1 in human primary
brain tumors
JOURNAL Cancer Res. 53 (11), 2475-2478 (1993)
MEDLINE 93265440
PUBMED 8495408
FEATURES Location/Qualifiers
source 1..616
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="12"
/map="q22-q24"
/tissue_type="brain"
/dev_stage="fetal"
gene 1..462
/gene="IGF-1"
CDS 1..462
/gene="IGF-1"
/codon_start=1
/product="IGF-1a"
/protein_id="CAA40092.1"
/db_xref="GI:32990"
/db_xref="SWISS-PROT:P01343"
/translation="MGKISSLPTQLFKCCFCDFLKVKMHTMSSSHLFYLALCLLTFTS
SATAGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDL
RRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKEVHLKNASRGSAGNKNYRM"
mat_peptide 145..354
/gene="IGF-1"
/product="IGF-1a"
403..616
/note="exon 5"
BASE COUNT 159 a 158 c 160 g 139 t
ORIGIN



Query Match 66.6%; Score 344.2; DB 9; Length 616;
Best Local Similarity 87.3%; Pred. No. 2.5e-99;
Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        145 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 204

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        205 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 264

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        265 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 324

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        325 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 384

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        385 ATGCCCAAGACCCAG--------------------------------------------- 399

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        400 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 455

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        456 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 515

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        516 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 575

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        576 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 616



RESULT 9 
AX375028 
LOCUS AX375028 7260 bp DNA linear PAT 01-MAR-2002
DEFINITION Sequence 31 from Patent WO0210436.
ACCESSION AX375028
VERSION AX375028.1 GI:19169860
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Baak,J. and Mutter,G.L.
TITLE Prognostic classification of breast cancer
JOURNAL Patent: WO 0210436-A 31 07-FEB-2002;
THE BRIGHAM AND WOMEN'S HOSPITAL, INC. (US) ; Baak, Jan (US)
FEATURES Location/Qualifiers
source 1..7260
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
BASE COUNT 2330 a 1415 c 1240 g 2275 t
ORIGIN



Query Match 66.6%; Score 344.2; DB 6; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 3.6e-99; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 10 

AX411095 

LOCUS AX411095 7260 bp DNA linear PAT 14-JUN-2002
DEFINITION Sequence 3742 from Patent WO0229103.
ACCESSION AX411095
VERSION AX411095.1 GI:21443800
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Alvares,C., Horne,D., Peres-da-Silva,S. and Vockley,J.G.
TITLE Gene expression profiles in liver cancer
JOURNAL Patent: WO 0229103-A 3742 11-APR-2002;
GENE LOGIC INC (US)
FEATURES Location/Qualifiers
source 1..7260
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
/note="EMBL/GenBank Accession No. X57025"
BASE COUNT 2330 a 1415 c 1240 g 2275 t
ORIGIN



Query Match 66.6%; Score 344.2; DB 6; Length 7260;
Best Local Similarity 87.3%; Pred. No. 3.6e-99;
Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 11 

HSIGFACI 

LOCUS HSIGFACI 7260 bp mRNA linear PRI 17-FEB-1992
DEFINITION Human IGF-I mRNA for insulin-like growth factor I.
ACCESSION X57025
VERSION X57025.1 GI:33007
KEYWORDS insulin-like growth factor I.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 7260)
AUTHORS Steenbergh,P.H., Koonen-Reemst,A.M., Cleutjens,C.B. and
Sussenbach,J.S.
TITLE Complete nucleotide sequence of the high molecular weight human
IGF-I mRNA
JOURNAL Biochem. Biophys. Res. Commun. 175 (2), 507-514 (1991)
MEDLINE 91207342
PUBMED 2018498
REFERENCE 2 (bases 1 to 7260)
AUTHORS Steenbergh,P.H.
TITLE Direct Submission
JOURNAL Submitted (18-DEC-1990) P.H. Steenbergh, LAB FOR PHYSIOLOGICAL
CHEMISTRY, UNIVERSITY OF UTRECHT, VONDELLAAN 24 A, 3521 GG UTRECHT,
THE NETHERLANDS
FEATURES Location/Qualifiers
source 1..7260
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="12 q22~24.1"
/tissue_type="liver"
/dev_stage="adult"
gene 1..7260
/gene="IGF-I"
mRNA 1..7260
/gene="IGF-I"
/evidence=experimental
exon 1..229
/gene="IGF-I"
/number=1
/evidence=experimental
CDS 167..628
/gene="IGF-I"
/codon_start=1
/evidence=experimental
/product="insulin-like growth factor I"
/protein_id="CAA40342.1"
/db_xref="GI:33008"
/db_xref="SWISS-PROT:P01343"
/translation="MGKISSLPTQLFKCCFCDFLKVKMHTMSSSHLFYLALCLLTFTS
SATAGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDL
RRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKEVHLKNASRGSAGNKNYRM"
sig_peptide 167..310
/gene="IGF-I"
/evidence=experimental
mat_peptide 311..520
/gene="IGF-I"
/product="insulin-like growth factor I"
/evidence=experimental
exon 230..386
/gene="IGF-I"
/number=2
/evidence=experimental
exon 387..568
/gene="IGF-I"
/number=3
/evidence=experimental
exon 569..7236
/gene="IGF-I"
/number=5
/evidence=experimental
polyA_signal 861..865
/gene="IGF-I"
/note="1.1 kb mRNA"
/evidence=experimental
repeat_unit 3986..4026
/gene="IGF-I"
/note="CA-repeat"
/evidence=experimental
repeat_unit 5926..6215
/gene="IGF-I"
/evidence=experimental
/rpt_family="AluI"
polyA_signal 7205..7210
/gene="IGF-I"
/note="7.6 kb mRNA"
/evidence=experimental
BASE COUNT 2330 a 1415 c 1240 g 2275 t
ORIGIN



Query Match 66.6%; Score 344.2; DB 9; Length 7260;
Best Local Similarity 87.3%; Pred. No. 3.6e-99;
Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 12 

A29119 

LOCUS A29119 666 bp DNA linear PAT 15-JUN-1995
DEFINITION H. sapiens IGF1 gene fragment from patent GB2241703.
ACCESSION A29119
VERSION A29119.1 GI:1247520
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 666)
AUTHORS
JOURNAL Patent: GB 2241703-A 3 11-SEP-1991;
FEATURES Location/Qualifiers
source 1..666
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
CDS 25..384
/partial
/codon_start=1
/product="IGF-1 precursor"
/protein_id="CAA01955.1"
/db_xref="GI:4529932"
/translation="MALCLLTFTSSATAGPETLCGAELVDALQFVCGDRGFYFNKPTG
YGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKEV
HLKNASRGSAGNKNYRM"
mat_peptide 67..276
/product="IGF-1"
BASE COUNT 173 a 167 c 181 g 145 t
ORIGIN



Query Match 66.3%; Score 342.6; DB 6; Length 666;
Best Local Similarity 87.1%; Pred. No. 8.2e-99;
Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         67 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 126

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        127 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 186



Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180 



Db 



187 



246 



Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 306

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        307 ATGCCCAAGACCCAG--------------------------------------------- 321

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        322 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 377

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        378 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 437

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        438 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 497

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        498 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 538



RESULT 13 

HSIGFI 

LOCUS HSIGFI 725 bp mRNA linear PRI 11-DEC-1998
DEFINITION Homo sapiens mRNA for insulin-like growth factor 1A precursor,
complete CDS.
ACCESSION X00173
VERSION X00173.1 GI:33015
KEYWORDS growth factor; insulin super family; insulin-like growth factor I;
signal peptide; somatomedin.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Jansen,M., van Schaik,F.M., Ricker,A.T., Bullock,B., Woods,D.E.,
Gabbay,K.H., Nussbaum,A.L., Sussenbach,J.S. and Van den Brande,J.L.
TITLE Sequence of cDNA encoding human insulin-like growth factor I
precursor
JOURNAL Nature 306 (5943), 609-611 (1983)
MEDLINE 84068210
PUBMED 6358902
COMMENT Data kindly reviewed (28-MAY-1984) by M. Jansen.
FEATURES Location/Qualifiers
source 1..725
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
gene 1..725
/gene="IGF-1"
CDS 12..473
/gene="IGF-1"
/codon_start=1
/product="insulin-like growth factor 1A precursor"
/protein_id="CAA24998.1"
/db_xref="GI:33016"
/db_xref="SWISS-PROT:P01343"
/translation="MGKISSLPTQLFKCCFCDFLKVKMHTMSSSHLFYLALCLLTFTS
SATAGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDL
RRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKEVHLKNASRGSAGNKNYRM"
polyA_site 725
/gene="IGF-1"
BASE COUNT 190 a 174 c 183 g 178 t
ORIGIN



Query Match 66.3%; Score 342.6; DB 9; Length 725;
Best Local Similarity 87.1%; Pred. No. 8.3e-99;
Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        156 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 215

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        216 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 275

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        276 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 335

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        336 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 395

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        396 ATGCCCAAGACCCAG--------------------------------------------- 410

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        411 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 466

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        467 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 526

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        527 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 586

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        587 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 627



RESULT 14 



HUMGFII 
LOCUS HUMGFII 728 bp mRNA linear PRI 08-NOV-1994
DEFINITION Human insulin-like growth factor I mRNA, complete cds.
ACCESSION M29644
VERSION M29644.1 GI:183119
KEYWORDS insulin-like growth factor.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 728)
AUTHORS Rall,L.B., Scott,J. and Bell,G.I.
TITLE Human insulin-like growth factor I and II messenger RNA: isolation
of complementary DNA and analysis of expression
JOURNAL Meth. Enzymol. 146, 239-248 (1987)
MEDLINE 88065102
PUBMED 3683205
COMMENT Original source text: Human (adult) liver, cDNA to mRNA.
FEATURES Location/Qualifiers
source 1..728
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/map="12q23"
gene 1..728
/gene="IGF1"
CDS 81..473
/gene="IGF1"
/note="insulin-like growth factor I precursor"
/codon_start=1
/protein_id="AAA52543.1"
/db_xref="GI:183120"
/db_xref="GDB:G00-120-081"
/translation="MHTMSSSHLFYLALCLLTFTSSATAGPETLCGAELVDALQFVCG
DRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSARSVRAQR
HTDMPKTQKEVHLKNASRGSAGNKNYRM"
sig_peptide 81..155
/gene="IGF1"
/note="insulin-like growth factor I signal peptide"
mat_peptide 156..365
/gene="IGF1"
/product="insulin-like growth factor I"
BASE COUNT 193 a 174 c 183 g 178 t
ORIGIN Chromosome 12q23.



Query Match 66.3%; Score 342.6; DB 9; Length 728;
Best Local Similarity 87.1%; Pred. No. 8.3e-99;
Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5;



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        156 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 215

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        216 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 275

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        276 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 335

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        336 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 395

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        396 ATGCCCAAGACCCAG--------------------------------------------- 410

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        411 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 466

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        467 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 526

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        527 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 586

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        587 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 627



RESULT 15
HUMIGFI
LOCUS       HUMIGFI 1076 bp mRNA linear PRI 08-NOV-1994
DEFINITION  Human insulin-like growth factor mRNA, complete cds.
ACCESSION   M27544
VERSION     M27544.1 GI:184829
KEYWORDS    insulin-like growth factor.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1076)
  AUTHORS   Le Bouc,Y., Dreyer,D., Jaeger,F., Binoux,M. and Sondermeyer,P.
  TITLE     Complete characterization of the human IGF-I nucleotide sequence
            isolated from a newly constructed adult liver cDNA library
  JOURNAL   FEBS Lett. 196 (1), 108-112 (1986)
  MEDLINE   86108910
  PUBMED    2935423
COMMENT     Original source text: Human liver, cDNA to mRNA, clones
            lambda-TG[03, 04, 05].
FEATURES             Location/Qualifiers
     source          1..1076
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /map="7p13-p12"
     gene            1..1076
                     /gene="IGFBP1"
     mRNA            <1..>1076
                     /gene="IGFBP1"
                     /note="IGF mRNA (alt.)"
     mRNA            <1..989
                     /gene="IGFBP1"
                     /note="IGF mRNA (alt.)"
     CDS             149..610
                     /gene="IGFBP1"
                     /note="insulin-like growth factor precursor"
                     /codon_start=1
                     /protein_id="AAA52787.1"
                     /db_xref="GI:306927"
                     /db_xref="GDB:G00-120-075"
                     /translation="MGKISSLPTQLFKCCFCDFLKVKMHTMSSSHLFYLALCLLTFTS
                     SATAGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDL
                     RRLEMYCAPLKPAKSARSVRAQRHTDMPKTQKEVHLKNASRGSAGNKNYRM"
     sig_peptide     149..292
                     /gene="IGFBP1"
                     /note="insulin-like growth factor signal peptide"
     mat_peptide     293..502
                     /gene="IGFBP1"
                     /product="insulin-like growth factor"
BASE COUNT  283 a 251 c 239 g 303 t
ORIGIN      Chromosome 7p13-p12.



Query Match 66.3%; Score 342.6; DB 9; Length 1076;
Best Local Similarity 87.1%; Pred. No. 8.8e-99;
Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5;

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 293 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 352

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 353 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 412

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 413 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 472

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 473 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 532

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       |||||||||||||||
Db 533 ATGCCCAAGACCCAG--------------------------------------------- 547

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 548 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 603

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db 604 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 663

Qy 420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
       ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db 664 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 723

Qy 478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||| ||| |||||||||||||||||||||||||||| |
Db 724 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACTT 764



Search completed: December 13, 2003, 09:27:32 
Job time : 2313.97 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         December 13, 2003, 02:35:18 ; Search time 207.586 Seconds
                (without alignments)
                6723.048 Million cell updates/sec

Title:          US-09-852-261-1
Perfect score:  517
Sequence:       1 ggaccggagacgctctgcgg tgaaatacacaagtaaacat 517

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       2552756 seqs, 1349719017 residues

Total number of hits satisfying chosen parameters:      5105512

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :



N_Geneseq_19Jun03:*
1: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2000.DAT:*
22: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001A.DAT:*
23: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001B.DAT:*
24: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2002.DAT:*
25: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2003.DAT:*


Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

SUMMARIES

                  %
Result          Query
  No.   Score   Match  Length   DB  ID        Description

   1     517   100.0     517   22  AAD06398  Human IGF-I isofor
   2     517   100.0     517   24  AAS16877  Human mechano-grow
   3   467.4    90.4     523   22  AAD06400  Rabbit IGF-I isofo
   4   467.4    90.4     523   24  AAS16879  Rabbit mechano-gro
   5   467.4    90.4     553   18  AAT84893  Rabbit insulin lik
   6   377.2    73.0     471   22  AAD06405  Rabbit liver-type
   7   377.2    73.0     471   24  AAS16884  Rabbit insulin-lik
   8   344.2    66.6     818    8  AAN70436  Sequence encoding
   9   344.2    66.6    7260   24  ABT11091  Human breast cance
  10   344.2    66.6    7260   24  ABK84583  Human cDNA differe
  11   344.2    66.6    7260   24  ABN97244  Gene #3742 used to
  12   344.2    66.6    7260   24  ABK64812  Human benign prost
  13   344.2    66.6    7260   24  ABK35504  Human endometrial
  14   344.2    66.6    7260   24  ABK35561  Gene IGF1 differen
  15   342.6    66.3     111   18  AAT84894  Human insulin like
  16   339.4    65.6     622    7  AAN60490  Human prepro-somat
  17   325.2    62.9     539   22  AAD06399  Rat IGF-I isoform
  18   325.2    62.9     539   24  AAS16878  Rat mechano-growth
  19   318.2    61.5     651   25  ABV76185  Mouse insulin-like
  20   308.6    59.7    1136    8  AAN70435  Sequence encoding
  21   286.4    55.4    3599   19  AAV50428  Plasmid pIG0552 lo
  22   286.4    55.4    3599   19  AAV40796  Actual sequence of
  23   286.4    55.4    3600   19  AAV50427  Plasmid pIG0552 up
  24   286.4    55.4    3600   19  AAV40795  Expected sequence
  25   286.4    55.4    5707   20  AAX88055  Plasmid pIG0335 DN
  26   286.4    55.4    6345   20  AAX88054  Plasmid pIG0100A D
  27   285.4    55.2     612   22  AAS14695  Human cDNA encodin
  28   285.4    55.2     612   25  ABZ83309  Toxicologically re
  29   271.2    52.5     978   14  AAQ47804  Sequence encoding
  30   258.4    50.0     317   24  AAS16882  Human insulin-like
  31   258.4    50.0     318   22  AAD06403  Human liver-type I
  32   258.4    50.0     462   19  AAV50426  Human IGF-1 encodi
  33   258.4    50.0     462   19  AAV40794  Human IGF-1 coding
  34   258.4    50.0     462   24  ABZ35734  Human IGF1 polynuc
  35   258.4    50.0     462   24  ABX09977  Human IGF1 DNA fra
  36   258.4    50.0     462   24  ABV78158  Human IGF1 DNA SEQ
  37   258.4    50.0     462   24  ABL91699  Human polynucleoti
  38   252.6    48.9    1052   20  AAX27498  Rat liver form of
  39   247.8    47.9     487   22  AAD06404  Rat liver-type IGF
  40   247.8    47.9     487   24  AAS16883  Rat insulin-like g
  41   234.2    45.3     671   24  ABT09479  Phase-1 Rat CT gen
  42     210    40.6     210   24  AAD45568  Human insulin-like
  43     210    40.6     210   24  AAD44955  Human insulin grow
  44     210    40.6     210   24  ABA03146  Native mature IGF-
  45   208.4    40.3     237   12  AAQ13568  Beta-gal/IGF-1 fus



ALIGNMENTS 



RESULT 1 
AAD06398 

ID AAD06398 standard; cDNA; 517 BP. 
XX 

AC AAD06398; 
XX 

DT 10-AUG-2001 (first entry) 
XX 

DE Human IGF-I isoform mechano-growth factor (MGF) cDNA. 
XX 

KW Human; IGF-I isoform; Insulin-like Growth Factor-I; MGF; 

KW mechano-growth factor; neurological disorder; neurodegenerative disorder; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; muscular atrophy; 

KW poliomyelitis; post-polio syndrome; toxin; motoneurone disorder; 

KW nerve damage; autosomal muscular dystrophy; diabetic neuropathy; 

KW sex-linked muscular dystrophy; peripheral neuropathy; 

KW Alzheimer's disease; Parkinson's disease; ss. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 1..333

FT /*tag= a 

FT /product= "Mechano-growth factor (MGF) " 

FT /note= "This region comprises exons 3-6. The CDS does 

FT not include start codon" 

FT /partial 

XX 



PN WO200136483-A1. 
XX 

PD 25-MAY-2001. 
XX 

PF 15-NOV-2000; 2000WO-GB04354 . 
XX 

PR 15-NOV-1999; 99GB-0026968 . 
XX 

PA (UNLO ) UNIV COLLEGE LONDON. 
XX 

PI Goldspink G, Johnson I; 
XX 

DR WPI; 2001-355620/37. 

DR P-PSDB; AAE02447. 
XX 

PT Use of mechano-growth factor, an isoform of Insulin-like Growth 

PT Factor-I, capable of reducing motoneurone loss, in the manufacture of a 

PT medicament for the treatment of neurological disorder - 

XX 

PS Claim 4; Page 49-50; 66pp; English. 
XX 

CC The present invention relates to use of mechano-growth factor (MGF) , 

CC an Insulin-like Growth Factor-I (IGF-I) isoform in the manufacture of a 

CC medicament for the treatment of neurological disorder. The MGF is capable 

CC of reducing motoneurone loss by 20% or greater in response to nerve 

CC avulsion, and effects motoneurone rescue, preferably adult motoneurone 

CC rescue. The MGF polynucleotide and polypeptide are useful in the 

CC manufacture of a medicament for the treatment of a neurological disorder, 



CC including a disorder of motoneurones and/or neurodegenerative disorder, 

CC e.g., amyotrophic lateral sclerosis, spinal muscular atrophy, progressive 

CC spinal muscular atrophy, infantile or juvenile muscular atrophy, 

CC poliomyelitis or post-polio syndrome, a disorder caused by exposure to a 

CC toxin, motoneurone trauma, a motoneurone lesion or nerve damage, an 

CC injury that affects motoneurones, motoneurone loss associated with aging, 

CC autosomal or sex-linked muscular dystrophy, diabetic neuropathy, 

CC peripheral neuropathies, Alzheimer's disease and Parkinson's disease. 

CC The present sequence is human IGF-I isoform MGF cDNA. MGF is a muscle 

CC isoform having extracellular (Ec) domain, hence also referred as 

CC IGF-I-Ec. The MGF protein comprises amino acid sequences encoded by 

CC nucleic acid sequence of IGF-I exons 4, 5 and 6 in the reading frame 

CC of MGF. 

XX 

SQ Sequence 517 BP; 150 A; 130 C; 139 G; 98 T; 0 other; 

Query Match 100.0%; Score 517; DB 22; Length 517; 

Best Local Similarity 100.0%; Pred. No. 1.6e-146; 

Matches 517; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360

Qy 361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420

Qy 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480

Qy 481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||||||||||||||||||||||||||||||||||
Db 481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517



RESULT 2 
AAS16877 

ID AAS16877 standard; cDNA; 517 BP. 
XX 

AC AAS16877; 
XX 

DT 25-FEB-2002 (first entry) 
XX 

DE Human mechano-growth factor (MGF) cDNA. 
XX 

KW Human; mechano-growth factor; insulin-like growth factor I; IGF-I; MGF; 

KW neuroprotective; nerve damage; peripheral nervous system; nerve severing; 

KW muscle; neurological disorder; motoneuron loss; motorneuron disorder; ss; 

KW nerve avulsion. 
XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 1..333 

FT /*tag= a 

FT /product= "Human MGF" 

FT /partial 

FT /note= "No start codon" 

FT exon 1..76

FT /*tag= b

FT /number= 3

FT exon 77..259

FT /*tag= c

FT /number= 4

FT exon 260..307

FT /*tag= d

FT /number= 5

FT exon 308..330

FT /*tag= e 

FT /number= 6 

XX 



PN WO200185781-A2. 
XX 

PD 15-NOV-2001. 
XX 

PF 10-MAY-2001; 2001WO-GB02054 . 
XX 

PR 10-MAY-2000; 2000GB-0011278 . 
XX 

PA (UNLO ) UNIV COLLEGE LONDON . 

PA (EGRI-) EAST GRINSTEAD MEDICAL RES TRUST. 

XX 

PI Goldspink G, Terenghi G; 
XX 

DR WPI; 2002-055585/07. 

DR P-PSDB; AAU10559. 
XX 

PT Use of insulin-like growth factor I (IGF-I) isoform known as 

PT mechano-growth factor which is encoded by IGF-I exons 4,5,6 and has 

PT ability to reduce motoneuron loss in response to nerve avulsion, to 

PT treat nerve damage 

XX 



PS Claim 11; Fig 5; 65pp; English. 
XX 

CC The invention relates to the use of an insulin-like growth factor I 

CC (IGF-I) isoform, known as mechano-growth factor (MGF) , in the manufacture 

CC of a medicament for treating nerve damage in the peripheral nervous 

CC system, or for treating nerve damage by localising MGF at the site of 

CC damage. The nerve damage may include severing of a nerve. The treatment 

CC may be combined with another treatment (such as a polypeptide growth 

CC factor other than MGF) that prevents or diminishes degeneration of the 

CC target organ (for example, muscle) which the damaged nerve innervates, 

CC whereby the treatment of the muscle with MGF or a polynucleotide encoding 

CC MGF prevents or diminishes degeneration. The method is useful for 

CC treating neurological disorders, preferably motorneuron disorders. These 

CC methods can reduce motoneuron loss by 20% or greater in response to nerve 

CC avulsion. This sequence represents cDNA encoding the human MGF. 

XX 

SQ Sequence 517 BP; 150 A; 130 C; 139 G; 98 T; 0 other; 



Query Match 100.0%; Score 517; DB 24; Length 517; 

Best Local Similarity 100.0%; Pred. No. 1.6e-146; 

Matches 517; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360

Qy 361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420

Qy 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480

Qy 481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||||||||||||||||||||||||||||||||||
Db 481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517



RESULT 3 
AAD06400 

ID AAD06400 standard; cDNA; 523 BP. 
XX 

AC AAD06400; 
XX 

DT 10-AUG-2001 (first entry) 
XX 

DE Rabbit IGF-I isoform mechano-growth factor (MGF) cDNA.
XX 

KW Rabbit; IGF-I isoform; Insulin-like Growth Factor-I; MGF; 

KW mechano-growth factor; neurological disorder; neurodegenerative disorder; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; muscular atrophy; 

KW poliomyelitis; post-polio syndrome; toxin; motoneurone disorder; 

KW nerve damage; autosomal muscular dystrophy; diabetic neuropathy; 

KW sex-linked muscular dystrophy; peripheral neuropathy; 

KW Alzheimer's disease; Parkinson's disease; ss. 

XX 

OS Oryctolagus cuniculus.



XX 

FH Key Location/Qualifiers 

FT CDS 1..336 
FT /*tag= a 

FT /product= "Mechano-growth factor (MGF) " 

FT /note= "This region comprises exons 3-6. The CDS does 

FT not include start codon" 

FT /partial 

XX 



PN WO200136483-A1. 
XX 

PD 25-MAY-2001. 
XX 

PF 15-NOV-2000; 2000WO-GB04354 . 
XX 

PR 15-NOV-1999; 99GB-0026968.
XX 

PA (UNLO ) UNIV COLLEGE LONDON. 
XX 

PI Goldspink G, Johnson I; 
XX 

DR WPI; 2001-355620/37. 

DR P-PSDB; AAE02449. 
XX 

PT Use of mechano-growth factor, an isoform of Insulin-like Growth 

PT Factor-I, capable of reducing motoneurone loss, in the manufacture of a 

PT medicament for the treatment of neurological disorder - 

XX 

PS Claim 4; Page 53-54; 66pp; English. 
XX 

CC The present invention relates to use of mechano-growth factor (MGF) , 

CC an Insulin-like Growth Factor-I (IGF-I) isoform in the manufacture of a 

CC medicament for the treatment of neurological disorder. The MGF is capable 

CC of reducing motoneurone loss by 20% or greater in response to nerve 

CC avulsion, and effects motoneurone rescue, preferably adult motoneurone 

CC rescue. The MGF polynucleotide and polypeptide are useful in the 



CC manufacture of a medicament for the treatment of a neurological disorder, 

CC including a disorder of motoneurones and/or neurodegenerative disorder, 

CC e.g., amyotrophic lateral sclerosis, spinal muscular atrophy, progressive 

CC spinal muscular atrophy, infantile or juvenile muscular atrophy, 

CC poliomyelitis or post-polio syndrome, a disorder caused by exposure to a 

CC toxin, motoneurone trauma, a motoneurone lesion or nerve damage, an 

CC injury that affects motoneurones, motoneurone loss associated with aging, 

CC autosomal or sex-linked muscular dystrophy, diabetic neuropathy, 

CC peripheral neuropathies, Alzheimer's disease and Parkinson's disease. 

CC The present sequence is rabbit IGF-I isoform MGF cDNA. MGF is a muscle

CC isoform having extracellular (Ec) domain, hence also referred as 

CC IGF-I-Ec. The MGF protein comprises amino acid sequences encoded by 

CC nucleic acid sequence of IGF-I exons 4, 5 and 6 in the reading frame 

CC of MGF. 

XX 

SQ Sequence 523 BP; 154 A; 129 C; 142 G; 98 T; 0 other; 



Query Match 90.4%; Score 467.4; DB 22; Length 523; 

Best Local Similarity 96.2%; Pred. No. 1.8e-131; 

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 
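
The summary figures printed for each result can be cross-checked from the match, mismatch, indel and gap counts. The sketch below is a hedged illustration, not GenCore's documented scoring code: the function name `gencore_stats` is hypothetical, and the mismatch penalty of 0.6 is an assumption inferred because it reproduces the printed scores (467.4 and 342.6) under the stated Gapop 10.0 and Gapext 1.0. It also assumes that each gap costs Gapop once plus Gapext per gapped position, and that Best Local Similarity is matches divided by aligned columns.

```python
def gencore_stats(matches, mismatches, indels, gaps, perfect_score,
                  gap_open=10.0, gap_ext=1.0, mismatch_penalty=0.6):
    """Recompute Score, Query Match % and Best Local Similarity %.

    Assumptions (not taken from GenCore documentation):
      - each match scores +1 (identity scoring table),
      - each mismatch costs `mismatch_penalty` (0.6 is inferred from the report),
      - each gap costs `gap_open` once plus `gap_ext` per gapped position.
    """
    gap_cost = gap_open * gaps + gap_ext * indels
    score = matches - mismatch_penalty * mismatches - gap_cost
    aligned_columns = matches + mismatches + indels
    return {
        "score": round(score, 1),
        "query_match": round(100.0 * score / perfect_score, 1),
        "best_local_similarity": round(100.0 * matches / aligned_columns, 1),
    }


# Counts printed for the rabbit MGF hit (perfect score 517).
rabbit = gencore_stats(matches=501, mismatches=16, indels=4, gaps=2,
                       perfect_score=517)
print(rabbit)
```

With the counts of the 342.6-point human hits (454 matches, 14 mismatches, 53 indels, 5 gaps) the same function gives a score of 342.6, a query match of 66.3% and a best local similarity of 87.1%, which agrees with the values printed in the report.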



Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
       ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db 241 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 300

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 360

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 420

Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 480

Qy 477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||||||||||||||||||||||||||||||||||||||
Db 481 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 521





RESULT 4 
AAS16879 

ID AAS16879 standard; cDNA; 523 BP. 
XX 

AC AAS16879; 
XX 

DT 25-FEB-2002 (first entry) 
XX 

DE Rabbit mechano-growth factor (MGF) cDNA. 
XX 

KW Rabbit; mechano-growth factor; insulin-like growth factor I; IGF-I; MGF; 

KW neuroprotective; nerve damage; peripheral nervous system; nerve severing; 

KW muscle; neurological disorder; motoneuron loss; motorneuron disorder; ss; 

KW nerve avulsion. 
XX 

OS Oryctolagus cuniculus. 



XX 

FH Key Location/Qualifiers 

FT CDS 1..336 

FT /*tag= a 

FT /product= "Rabbit MGF" 

FT /partial 

FT /note= "No start codon" 

FT exon 1..76

FT /*tag= b

FT /number= 3

FT exon 77..259

FT /*tag= c

FT /number= 4

FT exon 260..309

FT /*tag= d

FT /number= 5

FT exon 311..333

FT /*tag= e

FT /number= 6

XX 



PN WO200185781-A2. 
XX 

PD 15-NOV-2001. 

XX 

PF 10-MAY-2001; 2001WO-GB02054 . 
XX 

PR 10-MAY-2000; 2000GB-0011278.
XX 

PA (UNLO ) UNIV COLLEGE LONDON. 

PA (EGRI-) EAST GRINSTEAD MEDICAL RES TRUST. 

XX 

PI Goldspink G, Terenghi G; 
XX 

DR WPI; 2002-055585/07. 

DR P-PSDB; AAU10561. 
XX 

PT Use of insulin-like growth factor I (IGF-I) isoform known as 

PT mechano-growth factor which is encoded by IGF-I exons 4,5,6 and has 

PT ability to reduce motoneuron loss in response to nerve avulsion, to 

PT treat nerve damage 



PS Disclosure; Fig 7; 65pp; English. 
XX 

CC The invention relates to the use of an insulin-like growth factor I 

CC (IGF-I) isoform, known as mechano-growth factor (MGF) , in the manufacture 

CC of a medicament for treating nerve damage in the peripheral nervous 

CC system, or for treating nerve damage by localising MGF at the site of 

CC damage. The nerve damage may include severing of a nerve. The treatment 

CC may be combined with another treatment (such as a polypeptide growth 

CC factor other than MGF) that prevents or diminishes degeneration of the 

CC target organ (for example, muscle) which the damaged nerve innervates, 

CC whereby the treatment of the muscle with MGF or a polynucleotide encoding 

CC MGF prevents or diminishes degeneration. The method is useful for 

CC treating neurological disorders, preferably motorneuron disorders. These

CC methods can reduce motoneuron loss by 20% or greater in response to nerve 

CC avulsion. This sequence represents cDNA encoding the rabbit MGF. 

XX 

SQ Sequence 523 BP; 154 A; 129 C; 142 G; 98 T; 0 other; 



Query Match 90.4%; Score 467.4; DB 24; Length 523; 

Best Local Similarity 96.2%; Pred. No. 1.8e-131; 

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
       ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db 241 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 300

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 360

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 420

Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 480

Qy 477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||||||||||||||||||||||||||||||||||||||
Db 481 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 521



RESULT 5 
AAT84893 

ID AAT84893 standard; cDNA; 553 BP. 
XX 

AC AAT84893; 
XX 

DT 14-APR-1998 (first entry) 
XX 

DE Rabbit insulin like growth factor 1 encoding cDNA. 
XX 

KW Insulin like growth factor 1; IGF-1; Ec peptide; muscle disorder; 

KW heart; neuromuscular disease; primer; ss. 

XX 

OS Oryctolagus cuniculus.
XX 

FH Key Location/Qualifiers 

FT CDS 1..366 

FT /*tag= a 

FT /product= "IGF-1" 

XX 

PN WO9733997-A1.
XX

PD 18-SEP-1997.
XX

PF 11-MAR-1997; 97WO-GB00658.
XX

PR 11-MAR-1996; 96GB-0005124.
XX 

PA (UNLO ) ROYAL FREE HOSPITAL SCHOOL MED. 
XX 

PI Goldspink G; 
XX 

DR WPI; 1997-470877/43. 

DR P-PSDB; AAW23301. 
XX 

PT Use of insulin like growth factor I characterised by presence of Ec 

PT peptide - to treat humans or animals , particularly muscle disorders, 

PT heart conditions or neuromuscular diseases 
XX 

PS Disclosure; Fig 3; 33pp; English. 
XX 

CC A use of insulin like growth factor I (IGF-1) has been developed, and 

CC is characterised by the presence of the Ec peptide, or a functional 

CC equivalent, in the treatment or therapy of a human or animal. The IGF-1 

CC polypeptide can be used to treat muscular disorders, e.g. Duchenne or 

CC Becker muscular dystrophy, autosomal dystrophies and related progressive 

CC skeletal muscle weakness and wasting, muscle atrophy in ageing humans, 

CC spinal cord injury induced muscle atrophy and neuromuscular diseases, 

CC and cardiac disorders, e.g. diseases where promotion of cardiac muscle 

CC protein synthesis is a beneficial treatment, cardiomyopathies and acute 

CC heart failure or insult, specifically myocarditis or myocardial 

CC infarction. It can also be used to promote bone fracture healing and 

CC maintenance of bone in old age. The present sequence encodes rabbit 

CC IGF-1 used in the present specification. 



XX 

SQ Sequence 553 BP; 159 A; 142 C; 147 G; 105 T; 0 other; 

Query Match 90.4%; Score 467.4; DB 18; Length 553; 

Best Local Similarity 96.2%; Pred. No. 1.8e-131; 

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db  31 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 90

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db  91 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 150

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db 151 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 210

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db 211 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 270

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
       ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db 271 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 330

Qy 298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 331 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 390

Qy 358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 391 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 450

Qy 417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 451 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 510

Qy 477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||||||||||||||||||||||||||||||||||||||
Db 511 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 551



RESULT 6 
AAD06405 

ID AAD06405 standard; cDNA; 471 BP.
XX 

AC AAD06405; 
XX 

DT 10-AUG-2001 (first entry) 
XX 

DE Rabbit liver-type IGF-I isoform (L.IGF-I) cDNA.
XX 

KW Rabbit; IGF-I isoform; Insulin-like Growth Factor-I; MGF; 

KW mechano-growth factor; neurological disorder; neurodegenerative disorder; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; muscular atrophy; 



KW poliomyelitis; post-polio syndrome; toxin; motoneurone disorder; 

KW nerve damage; autosomal muscular dystrophy; diabetic neuropathy; 

KW sex-linked muscular dystrophy; peripheral neuropathy; 

KW Alzheimer's disease; Parkinson's disease; liver; L.IGF-I; ss. 

XX 

OS Oryctolagus cuniculus.



XX 

FH Key Location/Qualifiers 

FT CDS 1..318 

FT /*tag= a 

FT /product= "Liver-type IGF-I isoform (L.IGF-I)" 

FT /transl_except= (pos:7..9, aa:Gln) 

FT /transl_except= (pos:25..27, aa:Gln)

FT /note= "These translation exceptions occur while decoding 

FT the alternative version of the protein (AAE02456).

FT The CDS comprises exons 3, 4 and 6 and 

FT does not include start codon" 

FT /partial 

XX 



PN WO200136483-A1. 
XX 

PD 25-MAY-2001. 
XX 

PF 15-NOV-2000; 2000WO-GB04354.
XX

PR 15-NOV-1999; 99GB-0026968.
XX 

PA (UNLO ) UNIV COLLEGE LONDON. 
XX 

PI Goldspink G, Johnson I; 
XX 

DR WPI; 2001-355620/37. 

DR P-PSDB; AAE02452, AAE02456. 

XX 

PT Use of mechano-growth factor, an isoform of Insulin-like Growth 

PT Factor-I, capable of reducing motoneurone loss, in the manufacture of a 

PT medicament for the treatment of neurological disorder - 

XX 

PS Disclosure; Page 59-60; 66pp; English. 
XX 

CC The present invention relates to use of mechano-growth factor (MGF),

CC an Insulin-like Growth Factor-I (IGF-I) isoform in the manufacture of a 

CC medicament for the treatment of neurological disorder. The MGF is capable 

CC of reducing motoneurone loss by 20% or greater in response to nerve 

CC avulsion, and effects motoneurone rescue, preferably adult motoneurone 

CC rescue. The MGF polynucleotide and polypeptide are useful in the 

CC manufacture of a medicament for the treatment of a neurological disorder, 

CC including a disorder of motoneurones and/or neurodegenerative disorder, 

CC e.g., amyotrophic lateral sclerosis, spinal muscular atrophy, progressive 

CC spinal muscular atrophy, infantile or juvenile muscular atrophy, 

CC poliomyelitis or post-polio syndrome, a disorder caused by exposure to a 

CC toxin, motoneurone trauma, a motoneurone lesion or nerve damage, an 

CC injury that affects motoneurones, motoneurone loss associated with aging, 

CC autosomal or sex-linked muscular dystrophy, diabetic neuropathy, 

CC peripheral neuropathies, Alzheimer's disease and Parkinson's disease.

CC The present sequence is rabbit liver-type IGF-I isoform (L.IGF-I) cDNA. 

CC The L.IGF-I protein comprises amino acid sequences encoded by 



CC nucleic acid sequence of IGF-I exons 4 and 6. 
XX 

SQ Sequence 471 BP; 132 A; 118 C; 131 G; 90 T; 0 other; 

Query Match 73.0%; Score 377.2; DB 22; Length 471; 

Best Local Similarity 87.8%; Pred. No. 3.8e-104; 

Matches 455; Conservative 0; Mismatches 13; Indels 50; Gaps 2; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       ||||||||||| |||
Db 241 ATGCCCAAGACTCAG--------------------------------------------- 255

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 256 ----AAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 311

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 312 GATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 371

Qy 420 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 479
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 372 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 431

Qy 480 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       ||||||||||||||||||||||||||||||||||||||
Db 432 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 469



RESULT 7 
AAS16884 

ID AAS16884 standard; cDNA; 471 BP. 
XX 

AC AAS16884; 
XX 

DT 25-FEB-2002 (first entry) 
XX 



DE Rabbit insulin-like growth factor I liver-type isoform (L.IGF-I) cDNA.
XX 

KW Rabbit; mechano-growth factor; insulin-like growth factor I; IGF-I; MGF; 

KW neuroprotective; nerve damage; peripheral nervous system; nerve severing; 



KW muscle; neurological disorder; motoneuron loss; motorneuron disorder; ss; 

KW nerve avulsion; insulin-like growth factor I liver-type isoform; L.IGF-I; 
XX 

OS Oryctolagus cuniculus.



XX 

FH Key Location/Qualifiers 

FT CDS 1..318

FT /*tag= a

FT /product= "Rabbit L.IGF-I"

FT /partial

FT /note= "No start codon"

FT exon 1..75

FT /*tag= b

FT /number= exon 3

FT exon 76..258

FT /*tag= c

FT /number= exon 4

FT exon 259..315

FT /*tag= d

FT /number= exon 6

XX 



PN WO200185781-A2.
XX 

PD 15-NOV-2001. 
XX 

PF 10-MAY-2001; 2001WO-GB02054.
XX

PR 10-MAY-2000; 2000GB-0011278.
XX 

PA (UNLO ) UNIV COLLEGE LONDON. 

PA (EGRI-) EAST GRINSTEAD MEDICAL RES TRUST. 

XX 

PI Goldspink G, Terenghi G; 
XX 

DR WPI; 2002-055585/07. 

DR P-PSDB; AAU10564. 
XX 

PT Use of insulin-like growth factor I (IGF-I) isoform known as 

PT mechano-growth factor which is encoded by IGF-I exons 4,5,6 and has 

PT ability to reduce motoneuron loss in response to nerve avulsion, to 

PT treat nerve damage 

XX 

PS Disclosure; Fig 10; 65pp; English. 
XX 

CC The invention relates to the use of an insulin-like growth factor I 

CC (IGF-I) isoform, known as mechano-growth factor (MGF), in the manufacture

CC of a medicament for treating nerve damage in the peripheral nervous 

CC system, or for treating nerve damage by localising MGF at the site of 

CC damage. The nerve damage may include severing of a nerve. The treatment 

CC may be combined with another treatment (such as a polypeptide growth 

CC factor other than MGF) that prevents or diminishes degeneration of the 

CC target organ (for example, muscle) which the damaged nerve innervates, 

CC whereby the treatment of the muscle with MGF or a polynucleotide encoding 

CC MGF prevents or diminishes degeneration. The method is useful for 

CC treating neurological disorders, preferably motorneuron disorders. These 

CC methods can reduce motoneuron loss by 20% or greater in response to nerve 

CC avulsion. This sequence represents cDNA encoding the rabbit insulin-like 



CC growth factor I liver-type isoform (L.IGF-I) used in experiments on 

CC motoneuron loss. 

XX 

SQ Sequence 471 BP; 132 A; 118 C; 131 G; 90 T; 0 other; 

Query Match 73.0%; Score 377.2; DB 24; Length 471; 

Best Local Similarity 87.8%; Pred. No. 3.8e-104; 

Matches 455; Conservative 0; Mismatches 13; Indels 50; Gaps 2; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db  61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db 181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       ||||||||||| |||
Db 241 ATGCCCAAGACTCAG--------------------------------------------- 255

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 256 ----AAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 311

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 312 GATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 371

Qy 420 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 479
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 372 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 431

Qy 480 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       ||||||||||||||||||||||||||||||||||||||
Db 432 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 469



RESULT 8 
AAN70436 

ID AAN70436 standard; cDNA; 818 BP. 
XX 

AC AAN70436; 
XX 

DT 25-MAR-2003 (updated) 

DT 05-APR-1991 (first entry) 

XX 

DE Sequence encoding insulin-like growth factor 1A (IGF-1A).
XX 



KW Growth promoter; lactation enhancer; cell proliferation; ss. 
XX 

OS Homo sapiens. 
XX 

PN EP229750-A. 
XX 

PD 22-JUL-1987. 
XX 

PF 06-JAN-1987; 87EP-0870001.
XX

PR 20-NOV-1986; 86US-0929671.

PR 07-JAN-1986; 86US-0816662.
XX 

PA (UNIW ) UNIV WASHINGTON. 
XX 

PI Krivi GG, Rotwein PS; 
XX 

DR WPI; 1987-200203/29. 
XX 

PT New pre-pro-insulin-like growth factor-1 protein - obtd. by 

PT recombinant DNA procedures for use as growth promoters for 

PT enhancing lactation, for stimulating cell proliferation etc. 
XX 

PS Example; Fig 5; 59pp; English. 
XX 

CC A 42 base oligonucleotide corresponding to the DNA sequence encoding 

CC amino acids 10 to 23 of mature human IGF-I was synthesized (AAN70437).

CC The radiolabeled 42 mer was then employed to screen for IGF-I 

CC containing DNA sequences in a human liver cDNA library. Insulin-

CC like growth factors-1A and -1B cDNAs were isolated from a human cDNA

CC library by using lambdagt 11 (AAN70435, AAN70436). The human IGF-1

CC genomic gene was isolated and mapped. It encodes at least two 

CC preproinsulin-like growth factor-1 proteins. An essentially pure 

CC proproinsulin-like growth factor-1 protein comprising the sequence 

CC of amino acids shown in Figure six is claimed (AAP70277).

CC (Updated on 25-MAR-2003 to correct PA field.) 
XX 

SQ Sequence 818 BP; 232 A; 186 C; 187 G; 213 T; 0 other; 

Query Match 66.6%; Score 344.2; DB 8; Length 818; 

Best Local Similarity 87.3%; Pred. No. 4.8e-94; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 203 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 262

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 263 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 322

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 323 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 382

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 383 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 442

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       |||||||||||||||
Db 443 ATGCCCAAGACCCAG--------------------------------------------- 457

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 458 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 513

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db 514 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 573

Qy 420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
       ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db 574 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 633

Qy 478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||| ||| ||||||||||||||||||||||||||||||
Db 634 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 674



RESULT 9 
ABT11091 

ID ABT11091 standard; cDNA; 7260 BP. 
XX 

AC ABT11091; 
XX 

DT 04-DEC-2002 (first entry) 
XX 

DE Human breast cancer associated coding sequence SEQ ID NO: 1225. 
XX 

KW Human; breast specific gene; breast cancer; differential expression; 

KW cytostatic; gene therapy; gene; ss.

XX

OS Homo sapiens.
XX 

PN WO200259271-A2. 
XX 

PD 01-AUG-2002. 
XX 

PF 25-JAN-2002; 2002WO-US02176.
XX

PR 25-JAN-2001; 2001US-263757P.

PR 25-APR-2001; 2001US-286090P.

PR 23-MAY-2001; 2001US-292517P.
XX

PA (GENE-) GENE LOGIC INC.
XX 

PI Orr MS, Nation M, Diggans JC, Zeng W; 
XX 

DR WPI; 2002-674803/72. 
XX 

PT Diagnosing breast cancer in a patient comprises detecting the level of 

PT gene expression in cell or tissue samples, where a differential gene 

PT expression is indicative of breast cancer 
XX



PS Claim 1; SEQ ID NO 1225; 260pp + Sequence Listing; English. 
XX 

CC The present invention relates to methods of diagnosing breast cancer in a 

CC patient, which comprise detecting the level of expression in a tissue 

CC sample of two or more genes selected from those shown in ABT09867- 

CC ABT11112, where a differential expression of the genes indicates breast 

CC cancer. The methods are useful in diagnosing, treating, detecting the 

CC progression, and in monitoring treatment of breast cancer in patients. 

CC The methods are also useful as a screening tool for agents that modulate 

CC the onset or progression of breast cancer. The breast cancer genes may be 

CC used as diagnostic markers for the prediction or identification of the 

CC malignant state of breast tissue, for confirming the type and progression 

CC of cancer, and for drug screening and assays. The present sequence is a 

CC coding sequence of the invention. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 



Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       |||||||||||||||
Db 551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db 622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy 420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
       ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db 682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy 478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||| ||| ||||||||||||||||||||||||||||||
Db 742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 10 
ABK84583 

ID ABK84583 standard; cDNA; 7260 BP. 
XX 

AC ABK84583; 
XX 

DT 14-AUG-2002 (first entry) 
XX 

DE Human cDNA differentially expressed in granulocytic cells #1154. 
XX 

KW Human; ss; granulocytic cell; DNA chip; bacterial infection; 

KW viral infection; parasitic infection; protozoal infection; 

KW fungal infection; sterile inflammatory disease; psoriasis; 

KW rheumatoid arthritis; glomerulonephritis; asthma; thrombosis; 

KW cardiac reperfusion injury; renal reperfusion injury; ARDS; 

KW adult respiratory distress syndrome; inflammatory bowel disease; 

KW Crohn's disease; ulcerative colitis; periodontal disease; 

KW granulocyte activation; chronic inflammation; allergy. 

XX 

OS Homo sapiens.
XX

PN WO200228999-A2.
XX

PD 11-APR-2002.
XX

PF 03-OCT-2001; 2001WO-US30821.
XX

PR 03-OCT-2000; 2000US-237189P.
XX

PA (GENE-) GENE LOGIC INC.
XX 

PI Beazer-Barclay Y, Weissman SM, Yamaga S, Vockley J; 
XX 

DR WPI; 2002-435328/46. 
XX 

PT Detecting granulocyte activation by detecting differential expression 

PT of genes associated with granulocyte activation, which serves as 

PT diagnostic markers that is useful for monitoring disease states and 

PT drug toxicity 
XX 

PS Claim 1; SEQ ID No 1154; 114pp; English. 
XX 

CC The invention relates to detecting (M1) granulocyte (GC) activation

CC (GCA), by detecting the level of expression of gene(s) (Gs) identified by 

CC DNA chip analysis as given in the specification, and comparing 

CC the expression level to an expression level in an unactivated 

CC GC, where differential expression of Gs is indicative of GCA. 

CC Also included are modulating (M2) GA by contacting GC with an agent 

CC that alters the expression of at least one gene in Gs; (2) screening (M3) 

CC for an agent capable of modulating GCA or an inflammation (especially 

CC chronic) in a tissue, an allergic response in a subject, exposure of a 

CC subject to a pathogen or sterile inflammatory disease using the 



CC gene expression profile; (3) detecting (M4) an inflammation (especially 

CC chronic) in a tissue, an allergic response in a subject, exposure of a 

CC subject to a pathogen or sterile inflammatory disease, by detecting the 

CC level of expression in a sample of the tissue of gene(s) from Gs, where 

CC the level of expression of the gene is indicative of inflammation; 

CC (4) treating (M5) an inflammation (especially chronic) or in a tissue, 

CC an allergic response in a subject, exposure of a subject to a pathogen 

CC or sterile inflammatory disease, by contacting a tissue having 

CC inflammation with an agent that modulates the expression of gene(s) 

CC from Gs in the tissue. M1 is useful for detecting GCA; M2 is useful for

CC modulating GA; M3 is useful for screening an agent capable of modulating 

CC GCA preferably in an inflammation in a tissue; M4 is useful for 

CC detecting an inflammation (especially chronic) in a tissue, an allergic 

CC response in a subject, exposure of a subject to a pathogen or sterile 

CC inflammatory disease (e.g. psoriasis, rheumatoid arthritis, 

CC glomerulonephritis, asthma, thrombosis, cardiac reperfusion injury, renal 

CC reperfusion injury, ARDS, adult respiratory distress syndrome, 

CC inflammatory bowel disease, Crohn's disease, ulcerative colitis, 

CC periodontal disease; also bacterial infection, viral infection, 

CC parasitic infection, protozoal infection, fungal infection and M5 is 

CC useful for treating one of the above conditions. The present 

CC sequence represents a gene differentially expressed in granulocytes. 

CC Note: The sequence data for this patent did not form part 

CC of the printed specification, but was obtained in electronic 

CC format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 



Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;

Qy   1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy  61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy 121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy 241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
       |||||||||||||||
Db 551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy 301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
           ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy 361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
       |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db 622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy 420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
       ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db 682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy 478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
       |||||| ||| ||||||||||||||||||||||||||||||
Db 742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 11 
ABN97244 

ID ABN97244 standard; DNA; 7260 BP. 
XX 

AC ABN97244; 
XX 

DT 13-AUG-2002 (first entry) 
XX 

DE Gene #3742 used to diagnose liver cancer. 
XX 

KW Gene; liver cancer; ds; hepatocellular carcinoma; hepatotropic;

KW metastatic liver tumour; cytostatic; expression profile; disease state; 

KW disease progression; drug toxicity; drug efficacy; drug metabolism. 

XX 

OS Homo sapiens. 
XX 

PN WO200229103-A2. 
XX 

PD 11-APR-2002.
XX

PF 02-OCT-2001; 2001WO-US30589.
XX

PR 02-OCT-2000; 2000US-237054P.
XX 

PA (GENE-) GENE LOGIC INC. 
XX 

PI Home D, Alvares C, Peres-Da-Silva S, Vockley JG; 
XX 

DR WPI; 2002-426119/45. 
XX 

PT Diagnosing and detecting the progression of liver cancer, 

PT hepatocellular carcinoma or metastatic liver tumor in a patient, 

PT involves detecting the level of expression of two or more genes in a 

PT liver tissue sample 

XX 

PS Claim 1; SEQ ID NO 3742; 298pp; English. 
XX 

CC The invention relates to a novel method for diagnosing and detecting the 

CC progression of liver cancer, hepatocellular carcinoma or metastatic liver 

CC tumour in a patient, and differentiating metastatic liver cancer from 

CC hepatocellular carcinoma in a patient, involving detecting the level of 

CC expression of two or more genes represented in ABN93503-ABN97455 in a 

CC tissue sample. The method of the invention has hepatotropic, and 

CC cytostatic activity. The method is useful for diagnosing and detecting 



CC the progression of liver cancer, hepatocellular carcinoma and metastatic 

CC liver carcinoma in a patient. The method is useful for identifying 

CC expression profiles which serve as useful diagnostic markers as well as 

CC markers that can be used to monitor disease states, disease progression, 

CC drug toxicity, drug efficacy and drug metabolism. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 



Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 



Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782





RESULT 12 
ABK64812 

ID ABK64812 standard; DNA; 7260 BP. 
XX 

AC ABK64812; 



XX 

DT 18-JUN-2002 (first entry) 
XX 

DE Human benign prostatic hyperplasia gene #707. 
XX 

KW Human; benign prostatic hyperplasia; BPH; prostate cancer; gene; ds.
XX 

OS Homo sapiens. 
XX 

PN WO200212440-A2. 
XX 

PD 14-FEB-2002. 
XX 

PF 07-AUG-2001; 2001WO-US24708.
XX

PR 07-AUG-2000; 2000US-223323P.

PR 05-JUN-2001; 2001US-0873319.
XX 

PA (GENE-) GENE LOGIC INC. 

PA (NISB ) JAPAN TOBACCO INC. 
XX 

PI Munger WE, Kulkarni P, Getzenberg RH, Waga I, Yamamoto J; 
XX 

DR WPI; 2002-257476/30. 
XX 

PT Identifying drugs for and diagnosing benign prostatic hyperplasia, by 

PT detecting expression levels of one or more genes in prostate cells from 

PT patient that are differentially regulated compared to normal prostate 

PT cells - 
XX 

PS Disclosure; Page 391-393; 444pp; English. 
XX 

CC The invention relates to a method of diagnosing (I) the onset or 

CC progression of benign prostatic hyperplasia (BPH), or screening (II) for 

CC or identifying an agent that modulates the onset or progression of BPH. 

CC The method is based on changes in gene expression in BPH tissue isolated 

CC from patients exhibiting different clinical states of prostate 

CC hyperplasia as compared to normal prostate tissue. (I) comprises 

CC detecting the expression levels of one or more genes in prostate cells 

CC from the subject that are differentially regulated compared to normal 

CC prostate cells. (II) comprises preparing a first gene expression profile 

CC of BPH cells or BPH-like cell population, exposing the cells to the 

CC agent, preparing a second gene expression profile of the agent exposed 

CC cells, and comparing the first and second gene expression profiles. 

CC (I) is useful for diagnosing the onset or progression of BPH. (II) is 

CC useful for identifying an agent that modulates the onset or progression 

CC of BPH. The methods are useful to present information identifying 

CC the expression level in a tissue or cells, by comparing the expression 

CC level of genes given in the specification in the tissue or cells to the 

CC level of expression of gene in the database, and displaying the 

CC expression levels of at least one gene in the tissue or cell sample 

CC compared to the expression level in BPH. Agents using (II) are useful for 

CC treating BPH or prostate cancer. ABK64106-ABK64860 represent human 

CC benign prostatic hyperplasia gene sequences of the invention. 

XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 



Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782


RESULT 13
ABK35504

ID ABK35504 standard; DNA; 7260 BP.
XX

AC ABK35504;
XX

DT 08-MAY-2002 (first entry)
XX

DE Human endometrial cancer related gene, IGF1.
XX

KW Human; ds; gene; endometrial cancer; differential expression;

KW DNA microarray; protein microarray.

XX

OS Homo sapiens.
XX

PN WO200209573-A2.





PD 07-FEB-2002. 
XX 

PF 31-JUL-2001; 2001WO-US24104.
XX

PR 31-JUL-2000; 2000US-221735P.
XX 

PA (BGHM ) BRIGHAM & WOMENS HOSPITAL INC. 
XX 

PI Mutter GL; 
XX 

DR WPI; 2002-179967/23. 

DR P-PSDB; AAU84284. 
XX 

PT Diagnosing endometrial cancer comprises determining expression of 

PT nucleic acid molecules or expression products that are differentially 

PT expressed in normal and malignant endometrium - 

XX 

PS Claim 1; Page 85-89; 233pp; English. 
XX 

CC The invention relates to diagnosing endometrial cancer in a subject 

CC suspected of having endometrial cancer comprising determining the 

CC expression of a set of nucleic acid molecules or expression products in 

CC an endometrial sample suspected of being cancerous, where the set of 

CC nucleic acid molecules comprises at least 2 nucleic acid molecules 

CC selected from 50 fully defined sequences as given in the specification.

CC The nucleic acids are used as an array of at least 2 of the 50

CC nucleic acids bound to a solid substrate. Also included is a solid-phase

CC protein microarray comprising at least 2 antibodies or its antigen

CC binding fragments, that specifically bind at least 2 different 

CC polypeptides from the 50 fully defined sequences as given in the 

CC specification, fixed to a solid substrate. The methods and arrays are 

CC useful for the diagnosis of endometrial cancer, selecting and monitoring 

CC treatment regimes and identification of lead compounds useful for the 

CC treatment of endometrial cancer. The present sequence is one of 50 

CC genes differentially expressed between cancerous and non-cancerous 

CC samples.
XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 






Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 14
ABK35561

ID ABK35561 standard; DNA; 7260 BP.
XX

AC ABK35561;
XX

DT 08-MAY-2002 (first entry)
XX

DE Gene IGF1 differentially expressed in breast cancer tissue.
XX

KW Human; diagnosis of breast cancer; endometrial cancer; breast tumour;

KW MAI; mitotic activity index; cytostatic; gene; ds.

XX

OS Homo sapiens.
XX

PN WO200210436-A2.
XX

PD 07-FEB-2002.
XX

PF 27-JUL-2001; 2001WO-US23642.
XX

PR 28-JUL-2000; 2000US-222093P.
XX

PA (BGHM ) BRIGHAM & WOMENS HOSPITAL INC.

PA (BAAK/) BAAK J.
XX

PI Baak J, Mutter GL;
XX

DR WPI; 2002-180084/23.

DR P-PSDB; AAU84341.
XX

PT Diagnosing breast cancer comprises determining expression of nucleic

PT acid molecules or expression products that are differentially expressed



PT in normal and malignant tissue - 
XX 

PS Claim 1; Page 74-78; 219pp; English. 
XX 

CC The present invention relates to a method for diagnosing breast cancer 

CC in a subject suspected of having endometrial cancer. The method 

CC comprises determining the expression of a set of human genes or 

CC expression products in an endometrial sample suspected of being 

CC cancerous. The human genes of the invention are differentially 

CC expressed in breast tumours characterised as high or low MAI (mitotic 

CC activity index). These sets of genes can be used to discriminate between

CC high and low MAI breast tumours. The invention also provides DNA and 

CC protein microarrays for analysing the expression of the human genes and 

CC their protein products. The methods and arrays are useful for the 

CC diagnosis and prognosis of endometrial cancer, selecting and monitoring 

CC treatment regimes, and identification of compounds useful for the 

CC treatment of endometrial cancer. ABK35531-ABK35581 represent the human 

CC genes of the invention that are differentially expressed in breast 

CC cancer tissue. 

XX 

SQ Sequence 7260 BP; 2330 A; 1415 C; 1240 G; 2275 T; 0 other; 

Query Match 66.6%; Score 344.2; DB 24; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.1e-93;

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 15 
AAT84894 

ID AAT84894 standard; cDNA; 777 BP. 
XX 

AC AAT84894; 
XX 

DT 14-APR-1998 (first entry) 
XX 

DE Human insulin like growth factor 1 Ea isoform encoding cDNA. 
XX 

KW Insulin like growth factor 1; IGF-1; Ec peptide; muscle disorder; 

KW heart; neuromuscular disease; ss. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 26..496

FT /*tag= a 

FT /product= "IGF-1 Ea isoform" 

XX 

PN WO9733997-A1.
XX

PD 18-SEP-1997.
XX

PF 11-MAR-1997; 97WO-GB00658.
XX

PR 11-MAR-1996; 96GB-0005124.
XX

PA (UNLO ) ROYAL FREE HOSPITAL SCHOOL MED.
XX 

PI Goldspink G; 
XX 

DR WPI; 1997-470877/43. 

DR P-PSDB; AAW23302. 
XX 

PT Use of insulin like growth factor I characterised by presence of Ec 

PT peptide - to treat humans or animals, particularly muscle disorders, 

PT heart conditions or neuromuscular diseases 
XX 

PS Disclosure; Fig 4; 33pp; English. 
XX 

CC A use of insulin like growth factor I (IGF-1) has been developed, and 

CC is characterised by the presence of the Ec peptide, or a functional 

CC equivalent, in the treatment or therapy of a human or animal. The IGF-1 

CC polypeptide can be used to treat muscular disorders, e.g. Duchenne or 

CC Becker muscular dystrophy, autosomal dystrophies and related progressive 

CC skeletal muscle weakness and wasting, muscle atrophy in ageing humans, 

CC spinal cord injury induced muscle atrophy and neuromuscular diseases, 

CC and cardiac disorders, e.g. diseases where promotion of cardiac muscle 

CC protein synthesis is a beneficial treatment, cardiomyopathies and acute 

CC heart failure or insult, specifically myocarditis or myocardial 



CC infarction. It can also be used to promote bone fracture healing and 

CC maintenance of bone in old age. The present sequence encodes human 

CC IGF-1 Ea isoform used in the present specification. 
XX 

SQ Sequence 777 BP; 201 A; 193 C; 204 G; 179 T; 0 other; 



Query Match 66.3%; Score 342.6; DB 18; Length 777;

Best Local Similarity 87.1%; Pred. No. 1.4e-93; 

Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  179 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 238

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  239 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 298

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  299 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 358

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  359 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 418

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  419 ATGCCCAAGACCCAG--------------------------------------------- 433

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  434 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 489

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  490 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 549

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  550 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 609

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  610 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 650

Search completed: December 13, 2003, 06:03:48 
Job time : 209.586 secs
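
The per-hit summary figures above appear to follow directly from the alignment counts. The sketch below is an illustration only, not GenCore's implementation. It assumes that Query Match is the score divided by the perfect score of 517, that Best Local Similarity is matches divided by alignment columns (matches + mismatches + indels), and that the score is one point per match, minus a mismatch penalty, minus Gapop 10.0 per gap and Gapext 1.0 per indel. The mismatch penalty of 0.6 and the function name hit_statistics are assumptions: the penalty is not stated anywhere in the report and was chosen only because it reproduces the scores listed.

```python
def hit_statistics(matches, mismatches, indels, gaps,
                   perfect_score=517.0,
                   mismatch_penalty=0.6,   # assumed, not stated in the report
                   gap_open=10.0,          # "Gapop 10.0" from the report header
                   gap_extend=1.0):        # "Gapext 1.0" from the report header
    """Recompute a hit's summary figures from its alignment counts."""
    # Every alignment column is a match, a mismatch, or an indel.
    columns = matches + mismatches + indels
    # One point per match; penalties per mismatch, per gap, and per indel.
    score = (matches
             - mismatch_penalty * mismatches
             - gap_open * gaps
             - gap_extend * indels)
    return {
        "score": round(score, 1),
        "query_match_pct": round(100.0 * score / perfect_score, 1),
        "best_local_similarity_pct": round(100.0 * matches / columns, 1),
    }


if __name__ == "__main__":
    # Counts reported for the 7260 BP hits and for AAT84894.
    print(hit_statistics(455, 13, 53, 5))
    print(hit_statistics(454, 14, 53, 5))
```

Under these assumptions the two calls give scores of 344.2 and 342.6, which agree with the Score, Query Match and Best Local Similarity values printed for those hits.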



GenCore version 5.1.6
Copyright (c) 1993 - 2003 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:          December 13, 2003, 06:03:55 ; Search time 47.8037 Seconds
                 (without alignments)
                 4773.589 Million cell updates/sec

Title:           US-09-852-261-1
Perfect score:   517
Sequence:        1 ggaccggagacgctctgcgg tgaaatacacaagtaaacat 517

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        569978 seqs, 220691566 residues

Total number of hits satisfying chosen parameters: 1139956

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Issued_Patents_NA:*
                 1: /cgn2_6/ptodata/1/ina/5A_COMB.seq:*
                 2: /cgn2_6/ptodata/1/ina/5B_COMB.seq:*
                 3: /cgn2_6/ptodata/1/ina/6A_COMB.seq:*
                 4: /cgn2_6/ptodata/1/ina/6B_COMB.seq:*
                 5: /cgn2_6/ptodata/1/ina/PCTUS_COMB.seq:*
                 6: /cgn2_6/ptodata/1/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
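
The throughput figure in the header above can be checked with simple arithmetic. The sketch below assumes that a "cell update" is one comparison of a query residue with a database residue, and that both strands of every database sequence were searched, which would be consistent with the hit total of 1139956 being exactly twice the 569978 sequences searched. The function name is hypothetical and the both-strands factor is an assumption rather than something the report states.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Estimate search throughput in millions of cell updates per second."""
    # One dynamic-programming cell per (query residue, database residue) pair,
    # repeated for each strand searched (assumed to be 2 here).
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


if __name__ == "__main__":
    # Figures taken from the header: 517-residue query, 220691566 residues,
    # search time 47.8037 seconds.
    rate = million_cell_updates_per_sec(517, 220_691_566, 47.8037)
    print(f"{rate:.2f} million cell updates/sec")
```

With the header's figures this lands within rounding of the reported 4773.589 million cell updates per second.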
RESULT 1

US-09-142-583A-3 

; Sequence 3, Application US/09142583A 

; Patent No. 6221842 

;  GENERAL INFORMATION:
;    APPLICANT: GOLDSPINK, GEOFFREY
;    TITLE OF INVENTION: METHOD OF TREATING MUSCULAR DISORDERS
;    NUMBER OF SEQUENCES: 11
;    CORRESPONDENCE ADDRESS:
;      ADDRESSEE: NIXON & VANDERHYE P.C.
;      STREET: 1100 NORTH GLEBE ROAD
;      CITY: ARLINGTON
;      STATE: VA
;      COUNTRY: USA
;      ZIP: 22201
;    COMPUTER READABLE FORM:
;      MEDIUM TYPE: Floppy disk
;      COMPUTER: IBM PC compatible
;      OPERATING SYSTEM: PC-DOS/MS-DOS
;      SOFTWARE: PatentIn Release #1.0, Version #1.25
;    CURRENT APPLICATION DATA:
;      APPLICATION NUMBER: US/09/142,583A
;      FILING DATE: 29-Oct-1998
;      CLASSIFICATION: <Unknown>
;    PRIOR APPLICATION DATA:
;      APPLICATION NUMBER: WO PCT/GB97/00658
;      FILING DATE: 11-MAR-1997
;      APPLICATION NUMBER: GB 9605124.8
;      FILING DATE: 11-MAR-1996
;    ATTORNEY/AGENT INFORMATION:
;      NAME: SADOFF, B. J.
;      REGISTRATION NUMBER: 36663
;      REFERENCE/DOCKET NUMBER: 117-263
;    TELECOMMUNICATION INFORMATION:
;      TELEPHONE: 7038164000
;      TELEFAX: 7038164100
;  INFORMATION FOR SEQ ID NO: 3:
;    SEQUENCE CHARACTERISTICS:
;      LENGTH: 553 base pairs
;      TYPE: nucleic acid
;      STRANDEDNESS: both
;      TOPOLOGY: linear
;    MOLECULE TYPE: cDNA
;    FEATURE:
;      NAME/KEY: CDS
;      LOCATION: 1..363
;    SEQUENCE DESCRIPTION: SEQ ID NO: 3:
US-09-142-583A-3 

Query Match 90.4%; Score 467.4; DB 3; Length 553; 

Best Local Similarity 96.2%; Pred. No. 1.2e-134;

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   31 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 90

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db   91 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 150

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db  151 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 210

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db  211 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 270

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
        ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db  271 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 330

Qy  298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  331 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 390

Qy  358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
        ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db  391 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 450

Qy  417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  451 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 510

Qy  477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||||||||||||||||||||||||||||||||||||||
Db  511 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 551



RESULT 2 

US-09-142-583A-5 

; Sequence 5, Application US/09142583A 
; Patent No. 6221842 

;  GENERAL INFORMATION:
;    APPLICANT: GOLDSPINK, GEOFFREY
;    TITLE OF INVENTION: METHOD OF TREATING MUSCULAR DISORDERS
;    NUMBER OF SEQUENCES: 11
;    CORRESPONDENCE ADDRESS:
;      ADDRESSEE: NIXON & VANDERHYE P.C.
;      STREET: 1100 NORTH GLEBE ROAD
;      CITY: ARLINGTON
;      STATE: VA
;      COUNTRY: USA
;      ZIP: 22201
;    COMPUTER READABLE FORM:
;      MEDIUM TYPE: Floppy disk
;      COMPUTER: IBM PC compatible
;      OPERATING SYSTEM: PC-DOS/MS-DOS
;      SOFTWARE: PatentIn Release #1.0, Version #1.25
;    CURRENT APPLICATION DATA:
;      APPLICATION NUMBER: US/09/142,583A
;      FILING DATE: 29-Oct-1998
;      CLASSIFICATION: <Unknown>
;    PRIOR APPLICATION DATA:
;      APPLICATION NUMBER: WO PCT/GB97/00658
;      FILING DATE: 11-MAR-1997
;      APPLICATION NUMBER: GB 9605124.8
;      FILING DATE: 11-MAR-1996
;    ATTORNEY/AGENT INFORMATION:
;      NAME: SADOFF, B. J.
;      REGISTRATION NUMBER: 36663
;      REFERENCE/DOCKET NUMBER: 117-263
;    TELECOMMUNICATION INFORMATION:
;      TELEPHONE: 7038164000
;      TELEFAX: 7038164100
;  INFORMATION FOR SEQ ID NO: 5:
;    SEQUENCE CHARACTERISTICS:
;      LENGTH: 553 base pairs
;      TYPE: nucleic acid
;      STRANDEDNESS: both
;      TOPOLOGY: linear
;    MOLECULE TYPE: cDNA
;    FEATURE:
;      NAME/KEY: CDS
;      LOCATION: 341..397
;    SEQUENCE DESCRIPTION: SEQ ID NO: 5:

US-09-142-583A-5 

Query Match 90.4%; Score 467.4; DB 3; Length 553; 

Best Local Similarity 96.2%; Pred. No. 1.2e-134; 

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db   31 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 90

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db   91 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 150

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db  151 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 210

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db  211 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 270

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
        ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db  271 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 330

Qy  298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  331 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 390

Qy  358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
        ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db  391 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 450

Qy  417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  451 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 510

Qy  477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||||||||||||||||||||||||||||||||||||||
Db  511 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 551



RESULT 3 

US-09-142-583A-10 

; Sequence 10, Application US/09142583A 
; Patent No. 6221842 

;  GENERAL INFORMATION:
;    APPLICANT: GOLDSPINK, GEOFFREY
;    TITLE OF INVENTION: METHOD OF TREATING MUSCULAR DISORDERS
;    NUMBER OF SEQUENCES: 11
;    CORRESPONDENCE ADDRESS:
;      ADDRESSEE: NIXON & VANDERHYE P.C.
;      STREET: 1100 NORTH GLEBE ROAD
;      CITY: ARLINGTON
;      STATE: VA
;      COUNTRY: USA
;      ZIP: 22201
;    COMPUTER READABLE FORM:
;      MEDIUM TYPE: Floppy disk
;      COMPUTER: IBM PC compatible
;      OPERATING SYSTEM: PC-DOS/MS-DOS
;      SOFTWARE: PatentIn Release #1.0, Version #1.25
;    CURRENT APPLICATION DATA:
;      APPLICATION NUMBER: US/09/142,583A
;      FILING DATE: 29-Oct-1998
;      CLASSIFICATION: <Unknown>
;    PRIOR APPLICATION DATA:
;      APPLICATION NUMBER: WO PCT/GB97/00658
;      FILING DATE: 11-MAR-1997
;      APPLICATION NUMBER: GB 9605124.8
;      FILING DATE: 11-MAR-1996
;    ATTORNEY/AGENT INFORMATION:
;      NAME: SADOFF, B. J.
;      REGISTRATION NUMBER: 36663
;      REFERENCE/DOCKET NUMBER: 117-263
;    TELECOMMUNICATION INFORMATION:
;      TELEPHONE: 7038164000
;      TELEFAX: 7038164100
;  INFORMATION FOR SEQ ID NO: 10:
;    SEQUENCE CHARACTERISTICS:
;      LENGTH: 777 base pairs
;      TYPE: nucleic acid
;      STRANDEDNESS: both
;      TOPOLOGY: linear
;    MOLECULE TYPE: cDNA
;    FEATURE:
;      NAME/KEY: CDS
;      LOCATION: 26..493
;    SEQUENCE DESCRIPTION: SEQ ID NO: 10:
US-09-142-583A-10 



Query Match 66.3%; Score 342.6; DB 3; Length 777; 

Best Local Similarity 87.1%; Pred. No. 4.4e-96; 

Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5; 

Qy    1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  179 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 238

Qy   61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  239 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 298

Qy  121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
        ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  299 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 358

Qy  181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  359 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 418

Qy  241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
        |||||||||||||||
Db  419 ATGCCCAAGACCCAG--------------------------------------------- 433

Qy  301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
            ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db  434 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 489

Qy  361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
        |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db  490 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 549

Qy  420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
        ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db  550 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 609

Qy  478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
        |||||| ||| ||||||||||||||||||||||||||||||
Db  610 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 650



RESULT 4 
5405942-2 

;Patent No. 5405942

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B.; MERRYWEATHER,
; JAMES P. 

; TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 

;I AND II 

NUMBER OF SEQUENCES: 16 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/07/65,673 

; FILING DATE: 16-JUN-1987 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 630,557 
FILING DATE: 19-JUL-1984 
;SEQ ID NO: 2: 

LENGTH: 622 
5405942-2 

Query Match 65.6%; Score 339.4; DB 6; Length 622; 

Best Local Similarity 69.7%; Pred. No. 3.9e-95; 

Matches 363; Conservative 89; Mismatches 16; Indels 53; Gaps 5; 

Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

I I I I I I I I I II I I : I : I I I I I I I : I I I I : I I : I I I : I I : I :: I I I :: I I : I : | : | | | | | | 
Db 45 GGACCGGAGACGCUCUGCGGGGCUGAGCUGGUGGAUGCUCUUCAGUUCGUGUGUGGAGAC 104 

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

I I I I I I h - I I I I I I I I I I I I I I II : I : II I : I I I I I I I : I I I I I I I I I I I : I I I 

Db 105 AGGGGCUUUUAUUUCAACAAGCCCACAGGGUAUGGCUCCAGCAGUCGGAGGGCGCCUCAG 164 

Qy 121 ACAGGC AT C GT GGAT GAGT GCT GCTT C C GGAGCT GT GAT CTAAGGAGGCT GGAGAT GT AT 180 

I I I I I I : I I : I I I : II I : I I : I I : : I I I I M I : I : I I : I I I I I I I I : I I I I I : I : I : 
Db 165 ACAG GUAU C GU GGAU GAGU GCUGCUU C C GGAGCUGU GAU CAU AGGAGGCU GGAGAU GUAU 224 



Qy 
Db 



181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240 

: I I I I I I I I I : I I I I I I : I I I I I I : I I I I : I II : I : I : I I I : I I I I I I I I I I I I I I I I I I 
225 UGCGCACCCCUCAAGCCUGCCAAGUCAGCUCGCUCUGUCCGUGCCCAGCGCCACACCGAC 284 



Qy 



241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300 
I : I I I I I I I I I I I I I 

285 AUGCCCAAGACCCAG 299 



Db 



Qy 



301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360 



i i i i i i i - • II • I II M II M II II II * II II 

300 AAGGAAGUACAUUUGAAGAACGCAAGUAGAGGGAGUGCAGGAAACAAGAACUACAG 355 





Db 



Qy 



361 GAT GTA- GAAGAC C CT T CT GAGGAGT GAAGAAGGAC AGGC CAC C GC AGGAC CCTTTGCTC 419 



Db 




Qy 



420 T GCAC - AGTTAC CT G- TAAACAT T G GAAT AC C GGC CAAAAAATAAGT T T GAT C ACAT T T C 477 



•llll l l • • I I I • I • I I I l "•llll Ill I I I I I I I I • I I I • • • I I • |||... 

416 UGCACGAGUUAC CU GUUAAACUUU GGAACAC CU AC CAAAAAAUAAGUUUGAUAACAUUU A 475 




Db 



Qy 



478 AAAGAT - GGCAT T T C C C C CAAT GAAAT AC AC AAGT AAAC AT 517 



Db 




RESULT 5 

US-08-472-809B-8 

; Sequence 8, Application US/08472809B 
; Patent No. 5925564 
; GENERAL INFORMATION: 

APPLICANT: Schwartz, Robert J. 
APPLICANT: DeMayo, Franco J. 
APPLICANT: O'Malley, Bert W. 
; TITLE OF INVENTION: Expression Vector Systems and 
; TITLE OF INVENTION: Method of Use 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: Lyon & Lyon 
STREET: 633 West Fifth Street 
STREET: Suite 4700 
CITY: Los Angeles 
STATE: California 
; COUNTRY: U.S.A. 

ZIP : 90071-2066 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 
; MEDIUM TYPE: storage 

COMPUTER: IBM Compatible 
OPERATING SYSTEM: IBM P.C. DOS 5.0 
SOFTWARE: Word Perfect 5.1 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/472,809B

FILING DATE: June 7, 1995
CLASSIFICATION: 435 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: 08/209,846 



FILING DATE: March 9, 1994 
APPLICATION NUMBER: 07/789,919 
FILING DATE: November 6, 1991
ATTORNEY/ AGENT INFORMATION: 
NAME: Warburg, Richard J. 
REGISTRATION NUMBER: 32,327 
REFERENCE/ DOCKET NUMBER: 214/212 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (213) 489-1600
TELEFAX: (213) 955-0440
TELEX: 67-3510 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 5707 bases 
TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
US-08-472-809B-8 

Query Match 55.4%; Score 286.4; DB 2; Length 5707; 

Best Local Similarity 85.6%; Pred. No. 2.3e-78; 

Matches 363; Conservative 0; Mismatches 11; Indels 50; Gaps 2; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        793 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 852

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        853 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 912

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        913 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 972

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        973 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 1032

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db       1033 ATGCCCAAGACCCAG--------------------------------------------- 1047

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db       1048 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 1103

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| ||   |  |
Db       1104 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCCCCGGGC 1163

Qy        420 TGCA 423
              ||||
Db       1164 TGCA 1167



RESULT 6 

US-08-472-809B-7 

; Sequence 7, Application US/08472809B 
; Patent No. 5925564 
; GENERAL INFORMATION: 

APPLICANT: Schwartz, Robert J. 
; APPLICANT: DeMayo, Franco J. 

APPLICANT: O'Malley, Bert W. 

TITLE OF INVENTION: Expression Vector Systems and 
TITLE OF INVENTION: Method of Use 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: Lyon & Lyon 
STREET: 633 West Fifth Street 
STREET: Suite 4700 
; CITY: Los Angeles 

; STATE: California 

COUNTRY: U.S.A. 
ZIP: 90071-2066 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 

MEDIUM TYPE: storage 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: IBM P.C. DOS 5.0 
SOFTWARE: Word Perfect 5.1 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/472,809B
FILING DATE: June 7, 1995 
; CLASSIFICATION: 435 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/209,846 
FILING DATE: March 9, 1994 
APPLICATION NUMBER: 07/789,919 
; FILING DATE: November 6, 1991

ATTORNEY/ AGENT INFORMATION: 
NAME: Warburg, Richard J. 
REGISTRATION NUMBER: 32,327 
; REFERENCE/ DOCKET NUMBER: 214/212 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (213) 489-1600 
TELEFAX: (213) 955-0440 
; TELEX: 67-3510 

; INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 6345 bases 
TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
US-08-472-809B-7 

Query Match 55.4%; Score 286.4; DB 2; Length 6345; 

Best Local Similarity 85.6%; Pred. No. 2.4e-78; 

Matches 363; Conservative 0; Mismatches 11; Indels 50; Gaps 2; 


Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3702 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 3761

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3762 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 3821

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3822 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 3881

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3882 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 3941

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db       3942 ATGCCCAAGACCCAG--------------------------------------------- 3956

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db       3957 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 4012

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| ||   |  |
Db       4013 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCCCCGGGC 4072

Qy        420 TGCA 423
              ||||
Db       4073 TGCA 4076



RESULT 7 

5405942-13 

;Patent No. 5405942

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B.; MERRYWEATHER,
; JAMES P. 

TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 
;I AND II 

; NUMBER OF SEQUENCES: 16 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/65,673 

FILING DATE: 16-JUN-1987 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 630,557 
; FILING DATE: 19-JUL-1984 

;SEQ ID NO: 13: 
; LENGTH: 357 

5405942-13 

Query Match 49.4%; Score 255.2; DB 6; Length 357;

Best Local Similarity 98.8%; Pred. No. 2.8e-69; 

Matches 257; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         43 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 102

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        103 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 162

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        163 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 222

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db        223 TGCGCACCCCTCAGGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 282

Qy        241 ATGCCCAAGACCCAGAAGTA 260
              |||||||||||||||||| |
Db        283 ATGCCCAAGACCCAGAAGGA 302



RESULT 8 
5405942-9 
Patent No. 5405942 

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B.; MERRYWEATHER,
JAMES P. 

TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 
I AND II 

NUMBER OF SEQUENCES: 16 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/65,673 
FILING DATE: 16-JUN-1987 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 630,557 
FILING DATE: 19-JUL-1984 
SEQ ID NO: 9: 

LENGTH: 357 
5405942-9 

Query Match 49.1%; Score 253.6; DB 6; Length 357; 

Best Local Similarity 79.2%; Pred. No. 8.8e-69; 

Matches 206; Conservative 50; Mismatches 4; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              |||||||||||||:|:|||||||:||||:||:||| ||:|::|||::||:|:|:||||||
Db         43 GGACCGGAGACGCUCUGCGGGGCUGAGCUGGUGGACGCUCUUCAGUUCGUGUGUGGAGAC 102

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||::::|:::||||||||||||||||:|:|||:|||||||:|||||||||||:|||
Db        103 AGGGGCUUUUAUUUCAACAAGCCCACAGGGUAUGGCUCCAGCAGUCGGAGGGCGCCUCAG 162

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| |:||:|||:|||:||:| ::|||||||:|:||:|:||||||||:|||||:|:|:
Db        163 ACAGGUAUCGUGGAUGAGUGCUGUUUCCGGAGCUGUGAUCUAAGGAGGCUGGAGAUGUAU 222

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              :|||||||||:||||||:||||||:||||:|||:|:|:|||:||||||||||||||||||
Db        223 UGCGCACCCCUCAAGCCUGCCAAGUCAGCUCGCUCUGUCCGUGCCCAGCGCCACACCGAC 282

Qy        241 ATGCCCAAGACCCAGAAGTA 260
              |:|||||||||||||||| |
Db        283 AUGCCCAAGACCCAGAAGGA 302



RESULT 9 
5405942-7 

;Patent No. 5405942

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B.; MERRYWEATHER,
; JAMES P. 

TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 
;I AND II 

; NUMBER OF SEQUENCES : 16 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/65,673 

FILING DATE: 16-JUN-1987 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 630,557 

FILING DATE: 19-JUL-1984 
;SEQ ID NO: 7: 

LENGTH: 210 
5405942-7 

Query Match 40.3%; Score 208.4; DB 6; Length 210; 

Best Local Similarity 77.6%; Pred. No. 6e-55; 

Matches 163; Conservative 46; Mismatches 1; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              |||||||||||||:|:|||||||:||||:||:|||:||:|::|||::||:|:|:||||||
Db          1 GGACCGGAGACGCUCUGCGGGGCUGAGCUGGUGGAUGCUCUUCAGUUCGUGUGUGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||::::|:::||||||||||||||||:|:|||:|||||||:|||||||||||:|||
Db         61 AGGGGCUUUUAUUUCAACAAGCCCACAGGGUAUGGCUCCAGCAGUCGGAGGGCGCCUCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| |:||:|||:|||:||:||::|||||||:|:||:|:||||||||:|||||:|:|:
Db        121 ACAGGUAUCGUGGAUGAGUGCUGCUUCCGGAGCUGUGAUCUAAGGAGGCUGGAGAUGUAU 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 210
              :|||||||||:||||||:||||||:||||:
Db        181 UGCGCACCCCUCAAGCCUGCCAAGUCAGCU 210
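
The marker row printed between each Qy and Db row can be derived from the two aligned rows. The sketch below is an illustration rather than GenCore's own code. It assumes, from the alignments in this report, that "|" marks identical bases, ":" marks a T/U pair (counted as Conservative in the summary), and a blank marks a mismatch or a gap. The function name is hypothetical.

```python
def match_line(qy, db):
    """Build the marker row shown between aligned Qy and Db rows."""
    marks = []
    for q, d in zip(qy.upper(), db.upper()):
        if q == "-" or d == "-":
            marks.append(" ")   # gap column
        elif q == d:
            marks.append("|")   # identical base
        elif {q, d} == {"T", "U"}:
            marks.append(":")   # assumed conservative pair (DNA T vs RNA U)
        else:
            marks.append(" ")   # mismatch
    return "".join(marks)


# Last row of RESULT 9 (query positions 181-210 against Db 181-210).
qy = "TGCGCACCCCTCAAGCCTGCCAAGTCAGCT"
db = "UGCGCACCCCUCAAGCCUGCCAAGUCAGCU"
line = match_line(qy, db)
print(line)
print(line.count("|"), line.count(":"))
```

Summing the "|" and ":" counts over all rows of a hit should give its Matches and Conservative totals, provided the marker assumptions above hold.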



RESULT 10 

5405942-11 

;Patent No. 5405942 

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B. ;MERRYWEATHER, 
; JAMES P. 

; TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 
;I AND II 

NUMBER OF SEQUENCES: 16 

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/07/65,673 

FILING DATE: 16-JUN-1987 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 630,557 
; FILING DATE: 19-JUL-1984 

;SEQ ID NO: 11: 



LENGTH: 210 
5405942-11 

Query Match 40.3%; Score 208.4; DB 6; Length 210; 

Best Local Similarity 99.5%; Pred. No. 6e-55; 

Matches 209; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 210
              ||||||||||||||||||||||||||||||
Db        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 210



RESULT 11 
US-09-255-829-13 

; Sequence 13, Application US/09255829 

; Patent No. 6461617 

; GENERAL INFORMATION: 

APPLICANT: Shone, Clifford Charles 
; APPLICANT: Quinn, Conrad Padraig 
; APPLICANT: Foster, Keith Alan 

TITLE OF INVENTION: Recombinant Toxin Fragments 
NUMBER OF SEQUENCES: 29 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: STERNE, KESSLER, GOLDSTEIN, & FOX P.L.L.C. 

; STREET: 1100 NEW YORK AVENUE, NW, SUITE 600 

CITY: WASHINGTON 
; STATE: DC 

COUNTRY: USA 
ZIP: 20005-3934 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/255,829

FILING DATE: 23-FEB-1999 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/GB97/02273 
; FILING DATE: 22-AUG-1997 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/782,893 
FILING DATE: 27-DEC-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: ESMOND, ROBERT W. 



; REGISTRATION NUMBER: 32,893 

REFERENCE/DOCKET NUMBER: 1581.0130002 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 202-371-2600 
; TELEFAX: 202-371-2540 

INFORMATION FOR SEQ ID NO: 13: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2862 base pairs
; TYPE: nucleic acid 

STRANDEDNESS : double 
TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
FEATURE: 

NAME/KEY: CDS 
LOCATION: 1..2862 
US-09-255-829-13 

Query Match 40.3%; Score 208.4; DB 4; Length 2862;

Best Local Similarity 99.5%; Pred. No. 1.9e-54; 

Matches 209; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2644 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 2703

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2704 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 2763

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       2764 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 2823

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 210
              ||||||||||||||||||||||||||||||
Db       2824 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 2853



RESULT 12 

5405942-15 

; Patent No. 5405942 

APPLICANT: BELL, GRAEME I.; RALL, LESLIE B.; MERRYWEATHER,
; JAMES P. 

TITLE OF INVENTION: PREPRO INSULIN-LIKE GROWTH FACTORS 
;I AND II 

; NUMBER OF SEQUENCES: 16 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/65,673 
FILING DATE: 16-JUN-1987 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 630,557 

FILING DATE: 19-JUL-1984 
;SEQ ID NO: 15: 

LENGTH: 210 
5405942-15 



Query Match 40.0%; Score 206.8; DB 6; Length 210;

Best Local Similarity 77.1%; Pred. No. 1.9e-54;

Matches 162; Conservative 46; Mismatches 2; Indels 0; Gaps 0;

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              |||||||||||||:|:|||||||:||||:||:|||:||:|::|||::||:|:|:||||||
Db          1 GGACCGGAGACGCUCUGCGGGGCUGAGCUGGUGGAUGCUCUUCAGUUCGUGUGUGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||::::|:::||||||||||||||||:|:|||:|||||||:|||||||||||:|||
Db         61 AGGGGCUUUUAUUUCAACAAGCCCACAGGGUAUGGCUCCAGCAGUCGGAGGGCGCCUCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| |:||:|||:|||:||:||::|||||||:|:||:|:||||||||:|||||:|:|:
Db        121 ACAGGUAUCGUGGAUGAGUGCUGCUUCCGGAGCUGUGAUCUAAGGAGGCUGGAGAUGUAU 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCT 210
              :|||||||||:|| |||:||||||:||||:
Db        181 UGCGCACCCCUCAGGCCUGCCAAGUCAGCU 210



RESULT 13 
US-08-308-196A-1 

Sequence 1, Application US/08308196A 
Patent No. 5612198 
GENERAL INFORMATION: 

APPLICANT: Brierley, Russell A. 
APPLICANT: Davis, Geneva R.
APPLICANT: Holtz, Gregory C. 
APPLICANT: Gleeson, Martin A. 
APPLICANT: Howard, Bradley D. 

TITLE OF INVENTION: Production of Insulin-Like Growth 
TITLE OF INVENTION: Factor-1 in Methylotrophic Yeast Cells 
NUMBER OF SEQUENCES: 17 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Brown, Martin, Haller & McClain 
STREET: 1660 Union Street 
CITY: San Diego 
STATE: California 
COUNTRY: USA 
ZIP: 92101-2926 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/308,196A
FILING DATE: 09-SEPT-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/983,523 
FILING DATE: 03-MAR-1993 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/578,728 
FILING DATE: 04-SEP-1990 
ATTORNEY/AGENT INFORMATION: 
NAME: Seidman, Stephanie L. 
REGISTRATION NUMBER: 33,779 



REFERENCE/ DOCKET NUMBER: 51875 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619)238-0999 
; TELEFAX: (619)238-0062

; INFORMATION FOR SEQ ID NO: 1: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 240 base pairs 

; TYPE: nucleic acid 

STRANDEDNESS: double 
; TOPOLOGY: unknown 

MOLECULE TYPE: cDNA 
FEATURE : 

NAME/ KEY: CDS 
LOCATION: 14..232
US-08-308-196A-1 

Query Match 39.2%; Score 202.8; DB 1; Length 240;

Best Local Similarity 96.7%; Pred. No. 3.4e-53; 

Matches 207; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||| ||||||||||| ||||||||||||||||||
Db         17 GGACCGGAGACGCTCTGCGGGGCTGAGCTCGTGGATGCTCTGCAGTTCGTGTGTGGAGAC 76

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              |||||||||||||||||||||||||||||||||||||||||||||||  |||||||||||
Db         77 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGACGGGCGCCTCAG 136

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db        137 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTCGAGATGTAT 196

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCT 214
              |||||||||||||||||||||||||||||| | |
Db        197 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTTGAT 230



RESULT 14 
PCT-US91-06452-1 

Sequence 1, Application PC/TUS9106452 
GENERAL INFORMATION: 

APPLICANT: Brierley, Russell A.
APPLICANT: Davis, Geneva R.
APPLICANT: Holtz, Gregory C.
APPLICANT: Gleeson, Martin A.
APPLICANT: Bradley, D. H.
TITLE OF INVENTION: Production of Insulin-Like Growth
TITLE OF INVENTION: Factor-1 in Methylotrophic Yeast Cells
NUMBER OF SEQUENCES: 12

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fitch, Even, Tabin & Flannery 

STREET: 135 South LaSalle Street, Suite 900 

CITY: Chicago 

STATE: Illinois 

COUNTRY: USA 

ZIP: 60603 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: PCT/US91/06452
FILING DATE: 19910409 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: US 07/578,728 

; FILING DATE: 04-SEP-1990 

ATTORNEY/AGENT INFORMATION: 
; NAME: Seidman, Stephanie L. 

REGISTRATION NUMBER: 33,779 
REFERENCE/ DOCKET NUMBER: 51874 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (619)552-1311 
TELEFAX: (619)552-0095 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 240 base pairs
TYPE: NUCLEIC ACID 
STRANDEDNESS: double 
TOPOLOGY: unknown 
MOLECULE TYPE: cDNA 
; FEATURE : 

; NAME/KEY: CDS 

LOCATION: 14..232
PCT-US91-06452-1 

Query Match 39.2%; Score 202.8; DB 5; Length 240; 

Best Local Similarity 96.7%; Pred. No. 3.4e-53; 

Matches 207; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||| ||||||||||| ||||||||||||||||||
Db         17 GGACCGGAGACGCTCTGCGGGGCTGAGCTCGTGGATGCTCTGCAGTTCGTGTGTGGAGAC 76

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              |||||||||||||||||||||||||||||||||||||||||||||||  |||||||||||
Db         77 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGACGGGCGCCTCAG 136

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db        137 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTCGAGATGTAT 196

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCT 214
              |||||||||||||||||||||||||||||| | |
Db        197 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTTGAT 230



RESULT 15 
US-09-029-267-13 

; Sequence 13, Application US/09029267 
; Patent No. 6107057 
; GENERAL INFORMATION: 

APPLICANT: Crawford, Kenneth 



; APPLICANT: Zaror, Isabel 
APPLICANT: Innis, Michael 

TITLE OF INVENTION: Pichia Secretory Leader for Protein 
; TITLE OF INVENTION: Expression 
NUMBER OF SEQUENCES: 40
CORRESPONDENCE ADDRESS : 
; ADDRESSEE: Chiron Corporation 

; STREET: 4560 Horton Street 

; CITY: Emeryville 

STATE: California 
COUNTRY: United States 
ZIP: 94608 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 

; SOFTWARE: PatentIn Release #1.0, Version #1.30

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/029,267

FILING DATE: 

CLASSIFICATION: 
; ATTORNEY/ AGENT INFORMATION: 
; NAME: Chung, Ling-Fong 

REGISTRATION NUMBER: 36,482 

REFERENCE/ DOCKET NUMBER: 1165.100 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: (510) 601-2704 

TELEFAX: (510) 655-3542 
; INFORMATION FOR SEQ ID NO: 13: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 390 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: other nucleic acid 
; DESCRIPTION: /desc = "Synthetic" 

US-09-029-267-13 

Query Match 39.2%; Score 202.8; DB 3; Length 390; 

Best Local Similarity 96.7%; Pred. No. 4.2e-53; 

Matches 207; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
Db        160                                                              219

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
Db        220                                                              279

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
Db        280                                                              339

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCT 214
              |||||||||||||||||||||||||||||| | |
Db        340 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTTGAT 373



Search completed: December 13, 2003, 11:44:49 
Job time : 49.8037 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2003 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         December 13, 2003, 07:29:55 ; Search time 230.833 Seconds
                (without alignments)
                7443.919 Million cell updates/sec

Title:          US-09-852-261-1
Perfect score:  517
Sequence:       1 ggaccggagacgctctgcgg tgaaatacacaagtaaacat 517

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       2201672 seqs, 1661799599 residues

Total number of hits satisfying chosen parameters:      4403344

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Published_Applications_NA:*
                1:  /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
                2:  /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
                3:  /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
                4:  /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
                5:  /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
                6:  /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
                7:  /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
                8:  /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
                9:  /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
                10:  /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
                11:  /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
                12:  /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
                13:  /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq2:*
                14:  /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
                15:  /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
                16:  /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
                17:  /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
                18:  /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*
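
The throughput figure in the run header can be sanity-checked from the other header values. The sketch below assumes that one cell update is one query residue compared with one database residue, and that both strands of each database sequence are searched. The second assumption is suggested by the hit total (4403344) being twice the sequence count (2201672), but the report does not state it. The function name is illustrative.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Estimate search throughput in millions of matrix cells per second.

    strands=2 is an assumption: each database sequence is taken to be
    compared against the query in both orientations.
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Query length 517, 1661799599 residues searched, 230.833 s search time.
rate = million_cell_updates_per_sec(517, 1_661_799_599, 230.833)
print(f"{rate:.1f}")
```

With these inputs the estimate lands within about 0.01 of the reported 7443.919 Million cell updates/sec; the small gap is consistent with the search time being printed to only three decimals.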



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
US-09-852-261-1 

; Sequence 1, Application US/09852261 

; Patent No. US20020083477A1 

; GENERAL INFORMATION: 

; APPLICANT: GOLDSPINK, GEOFFREY 



; APPLICANT: TERENGHI, GIORGIO 

; TITLE OF INVENTION: REPAIR OF NERVE DAMAGE 

; FILE REFERENCE: 117-351 

; CURRENT APPLICATION NUMBER: US/09/852,261 

; CURRENT FILING DATE: 2001-05-10 

; PRIOR APPLICATION NUMBER: GB 0011278.9 

; PRIOR FILING DATE: 2000-05-10 

; NUMBER OF SEQ ID NOS : 14 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 1 

LENGTH: 517 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-09-852-261-1 

Query Match 100.0%; Score 517; DB 9; Length 517; 

Best Local Similarity 100.0%; Pred. No. 2.4e-160; 

Matches 517; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360

Qy        361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 GATGTAGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTCT 420

Qy        421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAAA 480

Qy        481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||||||||||||||||||||||||||||||||||
Db        481 GATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517


RESULT 2
US-09-852-261-5
; Sequence 5, Application US/09852261
; Patent No. US20020083477A1
; GENERAL INFORMATION:
; APPLICANT: GOLDSPINK, GEOFFREY
; APPLICANT: TERENGHI, GIORGIO
; TITLE OF INVENTION: REPAIR OF NERVE DAMAGE
; FILE REFERENCE: 117-351
; CURRENT APPLICATION NUMBER: US/09/852,261
; CURRENT FILING DATE: 2001-05-10
; PRIOR APPLICATION NUMBER: GB 0011278.9
; PRIOR FILING DATE: 2000-05-10
; NUMBER OF SEQ ID NOS: 14
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
; LENGTH: 523
; TYPE: DNA
; ORGANISM: Oryctolagus cuniculus
US-09-852-261-5

Query Match 90.4%; Score 467.4; DB 9; Length 523; 

Best Local Similarity 96.2%; Pred. No. 6.1e-144; 

Matches 501; Conservative 0; Mismatches 16; Indels 4; Gaps 2; 
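
The percentages printed for this result can be cross-checked with simple arithmetic. The sketch below is an inference from the figures in this report, not a documented GenCore formula: it assumes that Query Match is the score divided by the perfect score (517 for this query) and that Best Local Similarity is the match count divided by matches + mismatches + indels. The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score.

    Assumption: this is how the 'Query Match' column is derived.
    """
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns.

    Assumption: aligned columns = matches + mismatches + indels.
    """
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


# Figures taken from RESULT 2 of this report.
print(round(query_match_pct(467.4, 517), 1))            # 90.4
print(round(best_local_similarity_pct(501, 16, 4), 1))  # 96.2
```

Both values agree with the printed 90.4% and 96.2%; the same arithmetic also reproduces the figures printed for the other results in this listing.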

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db          1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db         61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db        181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
              ||||||||||| |||||||||||||| ||||||||||||||||| | |||||||||   |
Db        241 ATGCCCAAGACTCAGAAGTATCAGCCTCCATCTACCAACAAGAAAATGAAGTCTCAGAGG 300

Qy        298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 360

Qy        358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
              ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CAGGATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 420

Qy        417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 476
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTT 480

Qy        477 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||||||||||||||||||||||||||||||||||||||
Db        481 CAAAGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 521



RESULT 3
US-09-852-261-13
; Sequence 13, Application US/09852261
; Patent No. US20020083477A1
; GENERAL INFORMATION:
; APPLICANT: GOLDSPINK, GEOFFREY
; APPLICANT: TERENGHI, GIORGIO
; TITLE OF INVENTION: REPAIR OF NERVE DAMAGE
; FILE REFERENCE: 117-351
; CURRENT APPLICATION NUMBER: US/09/852,261
; CURRENT FILING DATE: 2001-05-10
; PRIOR APPLICATION NUMBER: GB 0011278.9
; PRIOR FILING DATE: 2000-05-10
; NUMBER OF SEQ ID NOS: 14
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 13
; LENGTH: 471
; TYPE: DNA
; ORGANISM: Oryctolagus cuniculus
US-09-852-261-13

Query Match 73.0%; Score 377.2; DB 9; Length 471; 

Best Local Similarity 87.8%; Pred. No. 4e-114; 

Matches 455; Conservative 0; Mismatches 13; Indels 50; Gaps 2; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db          1 GGACCGGAGACGCTCTGCGGTGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||| || |||||||||||||||||||| ||||||
Db         61 AGGGGCTTTTATTTCAACAAGCCCACAGGATACGGCTCCAGCAGTCGGAGGGCACCTCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              || |||||||||||||| || ||| |||| ||||| ||||||||||||||||||||||||
Db        181 TGTGCACCCCTCAAGCCGGCAAAGGCAGCCCGCTCCGTCCGTGCCCAGCGCCACACCGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              ||||||||||| |||
Db        241 ATGCCCAAGACTCAG--------------------------------------------- 255

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        256 ----AAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 311

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        312 GATGTAGGAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 371

Qy        420 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 479
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        372 TGCACAGTTACCTGTAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTCAA 431

Qy        480 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              ||||||||||||||||||||||||||||||||||||||
Db        432 AGATGGCATTTCCCCCAATGAAATACACAAGTAAACAT 469



RESULT 4
US-09-919-497-24
; Sequence 24, Application US/09919497
; Patent No. US20020106662A1
; GENERAL INFORMATION:
; APPLICANT: Mutter, George L.
; TITLE OF INVENTION: PROGNOSTIC CLASSIFICATION OF ENDOMETRIAL CANCER
; FILE REFERENCE: B0801/7225
; CURRENT APPLICATION NUMBER: US/09/919,497
; CURRENT FILING DATE: 2001-07-31
; PRIOR APPLICATION NUMBER: US 60/221,735
; PRIOR FILING DATE: 2000-07-31
; NUMBER OF SEQ ID NOS: 100
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 24
; LENGTH: 7260
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-919-497-24



Query Match 66.6%; Score 344.2; DB 10; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.3e-102; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5;

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 5
US-09-880-107-3739
; Sequence 3739, Application US/09880107
; Patent No. US20020142981A1
; GENERAL INFORMATION:
; APPLICANT: Home, Darci T.
; APPLICANT: Vockley, Joseph G.
; APPLICANT: Scherf, Uwe
; APPLICANT: Gene Logic, Inc.
; TITLE OF INVENTION: Gene Expression Profiles in Liver Cancer
; FILE REFERENCE: 44921-5028-WO
; CURRENT APPLICATION NUMBER: US/09/880,107
; CURRENT FILING DATE: 2001-06-14
; PRIOR APPLICATION NUMBER: US 60/211,379
; PRIOR FILING DATE: 2000-06-14
; PRIOR APPLICATION NUMBER: US 60/237,054
; PRIOR FILING DATE: 2000-10-02
; NUMBER OF SEQ ID NOS: 3950
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3739
; LENGTH: 7260
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Genbank Accession No. US20020142981A1 X57025
US-09-880-107-3739

Query Match 66.6%; Score 344.2; DB 10; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.3e-102; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 
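
The Matches, Mismatches, Indels and Gaps counts can be recomputed from any aligned Qy/Db pair, which is a useful check when the match line has been damaged in transcription. The sketch below is illustrative only: it assumes "-" marks a gap column and that a run of consecutive gap columns counts as one gap. The function names are hypothetical, and the input is the final block of this alignment (Qy 478-517, Db 742-782) with spacing removed.

```python
def match_line(qy: str, db: str) -> str:
    """Return '|' under identical residues and ' ' under everything else."""
    return "".join("|" if q == d and q != "-" else " " for q, d in zip(qy, db))


def tally(qy: str, db: str) -> tuple[int, int, int, int]:
    """Count (matches, mismatches, indels, gaps) over the aligned columns."""
    matches = mismatches = indels = gaps = 0
    in_gap = False
    for q, d in zip(qy, db):
        if q == "-" or d == "-":
            indels += 1          # every gap column is one indel
            if not in_gap:
                gaps += 1        # a new run of gap columns opens one gap
            in_gap = True
        else:
            in_gap = False
            if q == d:
                matches += 1
            else:
                mismatches += 1
    return matches, mismatches, indels, gaps


qy = "AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT"
db = "AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT"
print(match_line(qy, db))
print(tally(qy, db))  # (39, 1, 1, 1)
```

Summing the per-block tallies over all nine blocks of this alignment gives the totals printed in the Matches line above.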

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 6
US-09-873-319-707
; Sequence 707, Application US/09873319A
; Publication No. US20030134324A1
; GENERAL INFORMATION:
; APPLICANT: Munger, William E.
; APPLICANT: Kulkarni, Prakash
; APPLICANT: Getzenberg, Robert H.
; APPLICANT: Waga, Iwao
; APPLICANT: Yamamoto, Jun
; TITLE OF INVENTION: Identifying Drugs for and Diagnosis of Benign Prostatic
; TITLE OF INVENTION: Hyperplasia Using Gene Expression Profiles
; FILE REFERENCE: 44921-5029-US
; CURRENT APPLICATION NUMBER: US/09/873,319A
; CURRENT FILING DATE: 2001-06-05
; EARLIER APPLICATION NUMBER: US 60/223,323
; EARLIER FILING DATE: 2000-08-07
; NUMBER OF SEQ ID NOS: 755
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 707
; LENGTH: 7260
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Genbank Accession No. US20030134324A1 X57025
US-09-873-319-707

Query Match 66.6%; Score 344.2; DB 13; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.3e-102; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 



Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 7
US-09-960-706-1066
; Sequence 1066, Application US/09960706
; Publication No. US20030134280A1
; GENERAL INFORMATION:
; APPLICANT: Munger, William E.
; TITLE OF INVENTION: Identifying Drugs for and Diagnosis of Benign Prostatic Hyperplasia Using
; TITLE OF INVENTION: Gene Expression Profiles
; FILE REFERENCE: 44921-5029-01US
; CURRENT APPLICATION NUMBER: US/09/960,706
; CURRENT FILING DATE: 2001-09-24
; PRIOR APPLICATION NUMBER: 60/223,323
; PRIOR FILING DATE: 2000-08-07
; PRIOR APPLICATION NUMBER: 09/873,319
; PRIOR FILING DATE: 2001-06-05
; NUMBER OF SEQ ID NOS: 1124
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 1066
; LENGTH: 7260
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Genbank Accession No. US20030134280A1 X57025
US-09-960-706-1066

Query Match 66.6%; Score 344.2; DB 13; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.3e-102; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 8
US-10-136-639-4
; Sequence 4, Application US/10136639
; Publication No. US20030072761A1
; GENERAL INFORMATION:
; APPLICANT: LeBowitz, Jonathan
; TITLE OF INVENTION: METHODS AND COMPOSITIONS FOR TARGETING PROTEINS ACROSS THE BLOOD BRAIN
; TITLE OF INVENTION: BARRIER
; FILE REFERENCE: SYM-008
; CURRENT APPLICATION NUMBER: US/10/136,639
; CURRENT FILING DATE: 2002-09-06
; PRIOR APPLICATION NUMBER: US 60/329,650
; PRIOR FILING DATE: 2001-10-16
; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 4
; LENGTH: 7260
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-136-639-4

Query Match 66.6%; Score 344.2; DB 15; Length 7260; 

Best Local Similarity 87.3%; Pred. No. 1.3e-102; 

Matches 455; Conservative 0; Mismatches 13; Indels 53; Gaps 5; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        311 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 370

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        371 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 430

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        431 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 490

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        491 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 550

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        551 ATGCCCAAGACCCAG--------------------------------------------- 565

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        566 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 621

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        622 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 681

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        682 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 741

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        742 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 782



RESULT 9
US-10-207-655-54
; Sequence 54, Application US/10207655
; Publication No. US20030118592A1
; GENERAL INFORMATION:
; APPLICANT: Ledbetter, Jeffrey A.
; APPLICANT: Hayden-Ledbetter, Martha S.
; TITLE OF INVENTION: BINDING DOMAIN-IMMUNOGLOBULIN FUSION PROTEINS
; FILE REFERENCE: 390069.401C1
; CURRENT APPLICATION NUMBER: US/10/207,655
; CURRENT FILING DATE: 2002-07-25
; NUMBER OF SEQ ID NOS: 426
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 54
; LENGTH: 725
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-207-655-54

Query Match 66.3%; Score 342.6; DB 15; Length 725; 

Best Local Similarity 87.1%; Pred. No. 1.4e-102; 

Matches 454; Conservative 0; Mismatches 14; Indels 53; Gaps 5; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        156 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 215

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        216 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 275

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        276 ACAGGTATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 335

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        336 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 395

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        396 ATGCCCAAGACCCAG--------------------------------------------- 410

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        411 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 466

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| |||||||||
Db        467 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTC 526

Qy        420 TGCAC-AGTTACCTG-TAAACATTGGAATACCGGCCAAAAAATAAGTTTGATCACATTTC 477
              ||||| ||||||||| ||||| |||||| |||  |||||||||||||||||| ||||||
Db        527 TGCACGAGTTACCTGTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTA 586

Qy        478 AAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              |||||| ||| ||||||||||||||||||||||||||||||
Db        587 AAAGATGGGCGTTTCCCCCAATGAAATACACAAGTAAACAT 627



RESULT 10
US-09-852-261-3
; Sequence 3, Application US/09852261
; Patent No. US20020083477A1
; GENERAL INFORMATION:
; APPLICANT: GOLDSPINK, GEOFFREY
; APPLICANT: TERENGHI, GIORGIO
; TITLE OF INVENTION: REPAIR OF NERVE DAMAGE
; FILE REFERENCE: 117-351
; CURRENT APPLICATION NUMBER: US/09/852,261
; CURRENT FILING DATE: 2001-05-10
; PRIOR APPLICATION NUMBER: GB 0011278.9
; PRIOR FILING DATE: 2000-05-10
; NUMBER OF SEQ ID NOS: 14
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3
; LENGTH: 539
; TYPE: DNA
; ORGANISM: Rattus sp.
US-09-852-261-3

Query Match 62.9%; Score 325.2; DB 9; Length 539; 

Best Local Similarity 81.2%; Pred. No. 6.8e-97; 

Matches 429; Conservative 0; Mismatches 88; Indels 11; Gaps 4; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||| ||||| || |||||||||||||||||||| |||||||||||||||||||||
Db          1 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGACGCTCTTCAGTTCGTGTGTGGACCA 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||| ||||||||||||||||  ||||||||||||| ||||||||| || |||
Db         61 AGGGGCTTTTACTTCAACAAGCCCACAGTCTATGGCTCCAGCATTCGGAGGGCACCACAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              || ||||| ||||||||||| |||||||||||||||||||| |||||||||||||||||
Db        121 ACGGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              || |  | |  ||||||| | ||||||||||| ||  |||| |||||||||||||| |||
Db        181 TGTGTCCGCTGCAAGCCTACAAAGTCAGCTCGTTCCATCCGGGCCCAGCGCCACACTGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCA---G 297
              ||||||||||| |||||||  ||||||| ||| ||  ||||||| | ||||   ||   |
Db        241 ATGCCCAAGACTCAGAAGTCCCAGCCCCTATCGACACACAAGAAAAGGAAGCTGCAAAGG 300

Qy        298 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTA 357
              |||||||||||||||||| |||||||||||||||||||| |||||||||||||||| |||
Db        301 AGAAGGAAAGGAAGTACACTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTA 360

Qy        358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
              ||| ||||| || || ||| | |||||    | || | || | ||||||| || ||||||
Db        361 CAGAATGTAGGAGGAGCCTCCCGAGGAACAGAAAATGCCACGTCACCGCAAGATCCTTTG 420

Qy        417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATC 470
              ||           |    |||||| |||| ||| ||||      || ||| ||||  ||
Db        421 CTGCTTGAGCAACCTGCAAAACATCGGAACACCTGCCAAATATCAATAATGAGTTCAATA 480

Qy        471 ACATTTCAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
               ||||||| |||| |||||||||| |||||||||||||||||||||||
Db        481 TCATTTCAGAGATGGGCATTTCCCTCAATGAAATACACAAGTAAACAT 528



RESULT 11
US-10-161-088-1
; Sequence 1, Application US/10161088
; Publication No. US20030077761A1
; GENERAL INFORMATION:
; APPLICANT: Parrow, Vendela
; APPLICANT: Rosengren, Linda
; TITLE OF INVENTION: NEW METHODS
; FILE REFERENCE: 13425-111001
; CURRENT APPLICATION NUMBER: US/10/161,088
; CURRENT FILING DATE: 2002-05-31
; PRIOR APPLICATION NUMBER: SE 0101934-8
; PRIOR FILING DATE: 2001-06-01
; NUMBER OF SEQ ID NOS: 3
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1
; LENGTH: 651
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (73)...(471)
US-10-161-088-1

Query Match 61.5%; Score 318.2; DB 15; Length 651; 

Best Local Similarity 81.7%; Pred. No. 1.5e-94; 

Matches 419; Conservative 0; Mismatches 83; Indels 11; Gaps 4; 

Qy 1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60 

Mill Mill II I I I I I II I II I II II I I I II II II I I I I I I I II II I II I I I I 
Db 139 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGACCG 198 

Qy 61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120 

I M I I I I I I I I I I I I II I II I I II II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 199 AGGGGCTTTTACTTCAACAAGCCCACAGGCTATGGCTCCAGCATTCGGAGGGCACCTCAG 258 

Qy 121 AC AG GC AT C GT GGAT GAGT GCT GCT T C C GGAGCT GT GAT CT AAGGAGGCT GGAGAT GT AT 180 

I II II II I I I M II II II I I I I I I II II I I I I I I II II I II II I I I I I I I I II I I 

Db 259 AC AGGC AT T GT GGAT GAGT GTT GCT T C C GGAG CT GT GAT CT GAGGAGACT GGAGAT GT AC 318 

Qy 181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240 

II II II II I II I I I I II I II I I I I I II I I I II I II II II I I M II I III 

Db 319 T GT GC C C C ACT GAAGC CTACAAAAGC AGC C C GCT CT AT CC GTGC C C AGC G C CAC ACT GAC 378 

Qy 241 AT GC C CAAGAC C C AGAAGT AT C AG CC CCC AT CT AC CAACAAGAAC AC GAAGT CT CA G 297 

II II I I I I I II II I II I I I I I II I II II I II I I I I I I II II I II I 
Db 379 AT GC C CAAGACT C AGAAGT CC C C GT C C CT AT C GACAAACAAGAAAAC GAAGCT GCAAAGG 438 

Qy 298 AGAAGGAAAGGAAGTACATTT GAAGAACACAAGT AGAGGGAGT GCAGGAAACAAGAACTA 357 

I I I I II II I I I I I I II II II I I I I I II I II I M I I I I I I II II II I I I I I I I I I I III 

Db 439 AGAAGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTA 498 

Qy        358 CAGGATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTG 416
              ||| ||||| || || ||| |   ||||   | || | ||   |||||||||| ||||||
Db        499 CAGAATGTAGGAGGAGCCTCCCACGGAGCAGAAAATGCCACATCACCGCAGGATCCTTTG 558



Qy        417 CTCTGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATC 470
              ||           |    |||||| | || |||  |||      || |||||||   ||
Db        559 CTGCTTGAGCAACCTGCAAAACATCGAAACACCTACCAAATAACAATAATAAGTCCAATA 618

Qy        471 ACATTTCAAAGAT-GGCATTTCCCCCAATGAAA 502
              ||||| ||||||| |||||||||||||||||||
Db        619 ACATTACAAAGATGGGCATTTCCCCCAATGAAA 651



RESULT 12 
US-10-251-661-7 

Sequence 7, Application US/10251661 
Publication No. US20030166555A1 
GENERAL INFORMATION:
APPLICANT: Alberini, Cristina M. 
APPLICANT: Bear, Mark F. 

TITLE OF INVENTION: Methods and Compositions for Regulating 
TITLE OF INVENTION: Memory Consolidation 
FILE REFERENCE: 3499.1001-003 
CURRENT APPLICATION NUMBER: US/10/251,661 
CURRENT FILING DATE: 2002-09-20
PRIOR APPLICATION NUMBER: 60/193,614 
PRIOR FILING DATE: 2000-03-31 
PRIOR APPLICATION NUMBER: PCT/US01/10661 
PRIOR FILING DATE: 2001-04-02 
NUMBER OF SEQ ID NOS: 12

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 7 
LENGTH: 612 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE:
NAME/KEY: CDS
LOCATION: (151)...(564)
US-10-251-661-7 

Query Match 55.2%; Score 285.4; DB 13; Length 612; 

Best Local Similarity 86.5%; Pred. No. 1.1e-83;

Matches 359; Conservative 0; Mismatches 6; Indels 50; Gaps 2; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 306

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        307 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 366

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        367 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 426

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        427 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 486



Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              |||||||||||||||
Db        487 ATGCCCAAGACCCAG--------------------------------------------- 501

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db        502 ----AAGGAAGTACATTTGAAGAACGCAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 557

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTT 414
              |||||| ||||||||| ||||||||||||||  |||| |||||||||||| ||||
Db        558 GATGTAGGAAGACCCTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTT 612



RESULT 13 
US-09-852-261-9 

Sequence 9, Application US/09852261 
Patent No. US20020083477A1 
GENERAL INFORMATION: 
APPLICANT: GOLDSPINK, GEOFFREY 
APPLICANT: TERENGHI, GIORGIO 
TITLE OF INVENTION: REPAIR OF NERVE DAMAGE 
FILE REFERENCE: 117-351 

CURRENT APPLICATION NUMBER: US/09/852,261 
CURRENT FILING DATE: 2001-05-10 
PRIOR APPLICATION NUMBER: GB 0011278.9 
PRIOR FILING DATE: 2000-05-10 
NUMBER OF SEQ ID NOS: 14 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 9 
LENGTH: 318 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-852-261-9 

Query Match 50.0%; Score 258.4; DB 9; Length 318; 

Best Local Similarity 99.6%; Pred. No. 6.6e-75; 

Matches 259; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTA 260
              |||||||||||||||||| |
Db        241 ATGCCCAAGACCCAGAAGGA 260



RESULT 14 
US-09-852-261-11 

Sequence 11, Application US/09852261 
Patent No. US20020083477A1 
GENERAL INFORMATION: 
APPLICANT: GOLDSPINK, GEOFFREY 
APPLICANT: TERENGHI, GIORGIO 
TITLE OF INVENTION: REPAIR OF NERVE DAMAGE 
FILE REFERENCE: 117-351 

CURRENT APPLICATION NUMBER: US/09/852,261
CURRENT FILING DATE: 2001-05-10 
PRIOR APPLICATION NUMBER: GB 0011278.9 
PRIOR FILING DATE: 2000-05-10 
NUMBER OF SEQ ID NOS: 14
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 11 
LENGTH: 487 
TYPE: DNA 

ORGANISM: Rattus sp.
US-09-852-261-11 

Query Match 47.9%; Score 247.8; DB 9; Length 487; 

Best Local Similarity 74.5%; Pred. No. 2.6e-71; 

Matches 391; Conservative 0; Mismatches 77; Indels 57; Gaps 4; 

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||| ||||| || |||||||||||||||||||| |||||||||||||||||||||
Db          1 GGACCAGAGACCCTTTGCGGGGCTGAGCTGGTGGACGCTCTTCAGTTCGTGTGTGGACCA 60

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||||||||| ||||||||||||||||  ||||||||||||| ||||||||| || |||
Db         61 AGGGGCTTTTACTTCAACAAGCCCACAGTCTATGGCTCCAGCATTCGGAGGGCACCACAG 120

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              || ||||| ||||||||||| |||||||||||||||||||| |||||||||||||||||
Db        121 ACGGGCATTGTGGATGAGTGTTGCTTCCGGAGCTGTGATCTGAGGAGGCTGGAGATGTAC 180

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              || |  | |  ||||||| | ||||||||||| ||  |||| |||||||||||||| |||
Db        181 TGTGTCCGCTGCAAGCCTACAAAGTCAGCTCGTTCCATCCGGGCCCAGCGCCACACTGAC 240

Qy        241 ATGCCCAAGACCCAGAAGTATCAGCCCCCATCTACCAACAAGAACACGAAGTCTCAGAGA 300
              ||||||||||| |||
Db        241 ATGCCCAAGACTCAG--------------------------------------------- 255

Qy        301 AGGAAAGGAAGTACATTTGAAGAACACAAGTAGAGGGAGTGCAGGAAACAAGAACTACAG 360
                  ||||||||||| |||||||||||||||||||| |||||||||||||||| ||||||
Db        256 ----AAGGAAGTACACTTGAAGAACACAAGTAGAGGAAGTGCAGGAAACAAGACCTACAG 311

Qy        361 GATGTA-GAAGACCCTTCTGAGGAGTGAAGAAGGACAGGCCACCGCAGGACCCTTTGCTC 419
               ||||| || || ||| | |||||    | || | || | ||||||| || ||||||||
Db        312 AATGTAGGAGGAGCCTCCCGAGGAACAGAAAATGCCACGTCACCGCAAGATCCTTTGCTG 371

Qy        420 TGCACAGTTACCTGTAAACATTGGAATACCGGCCA------AAAAATAAGTTTGATCACA 473
                        |    |||||| |||| ||| ||||      || ||| ||||  ||  ||
Db        372 CTTGAGCAACCTGCAAAACATCGGAACACCTGCCAAATATCAATAATGAGTTCAATATCA 431

Qy        474 TTTCAAAGAT-GGCATTTCCCCCAATGAAATACACAAGTAAACAT 517
              ||||| |||| |||||||||| |||||||||||||||||||||||
Db        432 TTTCAGAGATGGGCATTTCCCTCAATGAAATACACAAGTAAACAT 476



RESULT 15 
US-10-238-114-1 

Sequence 1, Application US/10238114
Publication No. US20030100073A1
GENERAL INFORMATION:
APPLICANT: Merial
APPLICANT: ANDREONI, Christine Michele
TITLE OF INVENTION: IGF-1 AS FELINE VACCINE ADJUVANT, IN PARTICULAR AGAINST FELINE RETROVIRUS
FILE REFERENCE: 454313-3165.1
CURRENT APPLICATION NUMBER: US/10/238,114
CURRENT FILING DATE: 2002-09-10
PRIOR APPLICATION NUMBER: FR 01 11736
PRIOR FILING DATE: 2001-09-11
PRIOR APPLICATION NUMBER: US 60/318,666
PRIOR FILING DATE: 2001-09-12
NUMBER OF SEQ ID NOS: 20
SOFTWARE: PatentIn version 3.1
SEQ ID NO 1
LENGTH: 462
TYPE: DNA

ORGANISM: Felis catus
US-10-238-114-1 

Query Match 44.1%; Score 228; DB 15; Length 462; 

Best Local Similarity 92.3%; Pred. No. 9e-65; 

Matches 240; Conservative 0; Mismatches 20; Indels 0; Gaps 0;

Qy          1 GGACCGGAGACGCTCTGCGGGGCTGAGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGAC 60
              ||||| ||||||||||| ||||||||| ||||||| ||||||||||||||||||||||||
Db        145 GGACCAGAGACGCTCTGTGGGGCTGAGTTGGTGGACGCTCTTCAGTTCGTGTGTGGAGAC 204

Qy         61 AGGGGCTTTTATTTCAACAAGCCCACAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAG 120
              ||||| |||||||||||||||||||| |||||||||||||||||||||||||| ||||||
Db        205 AGGGGTTTTTATTTCAACAAGCCCACGGGGTATGGCTCCAGCAGTCGGAGGGCACCTCAG 264

Qy        121 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTAT 180
              ||||||||||||||||||||||||||||||||||||||||| ||| |||| ||||||||
Db        265 ACAGGCATCGTGGATGAGTGCTGCTTCCGGAGCTGTGATCTGAGGCGGCTAGAGATGTAC 324

Qy        181 TGCGCACCCCTCAAGCCTGCCAAGTCAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGAC 240
              || ||||||||||||||||||||||| || ||||| |||||||| ||||||||||| |||
Db        325 TGTGCACCCCTCAAGCCTGCCAAGTCTGCCCGCTCAGTCCGTGCTCAGCGCCACACTGAC 384

Qy        241 ATGCCCAAGACCCAGAAGTA 260
              ||||||||| | |||||| |
Db        385 ATGCCCAAGGCTCAGAAGGA 404

Search completed: December 13, 2003, 11:56:45 
Job time : 232.833 secs
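
The percentage columns in the result headers above can be cross-checked from the counts printed next to them. The sketch below is not part of the GenCore output. It assumes that "Query Match" is the hit score divided by a perfect score of 517 (the query length shown in the report header) and that "Best Local Similarity" is Matches divided by Matches + Mismatches + Indels. Both formulas are inferred from the printed figures, not taken from GenCore documentation, and the function names are hypothetical.

```python
# Cross-check of the per-hit statistics listed in this section.
# Assumption: PERFECT_SCORE is the 517 shown in the report header.
PERFECT_SCORE = 517.0


def query_match(score, perfect_score=PERFECT_SCORE):
    """Percent of the perfect score reached by a hit (assumed formula)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, mismatches, indels):
    """Percent of aligned columns that are exact matches (assumed formula)."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# (sequence ID, score, matches, mismatches, indels) copied from the report
HITS = [
    ("US-10-161-088-1", 318.2, 419, 83, 11),
    ("US-10-251-661-7", 285.4, 359, 6, 50),
    ("US-09-852-261-9", 258.4, 259, 1, 0),
    ("US-09-852-261-11", 247.8, 391, 77, 57),
    ("US-10-238-114-1", 228.0, 240, 20, 0),
]

if __name__ == "__main__":
    for seq_id, score, matches, mismatches, indels in HITS:
        qm = query_match(score)
        bls = best_local_similarity(matches, mismatches, indels)
        print(f"{seq_id}: Query Match {qm}%, Best Local Similarity {bls}%")
```

Under these assumptions the recomputed values agree with the percentages printed for all five hits, which suggests the counts and scores survived extraction intact.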